[HN Gopher] OpenFlights - airport and airline data
       ___________________________________________________________________
        
       OpenFlights - airport and airline data
        
       Author : cyberlab
       Score  : 160 points
       Date   : 2021-04-27 14:25 UTC (8 hours ago)
        
 (HTM) web link (openflights.org)
 (TXT) w3m dump (openflights.org)
        
       | gbrindisi wrote:
       | Does anyone know of a current dataset I could query to check
       | historical route status?
       | 
       | Since the pandemic I've found plenty of airlines selling tickets
       | and then systematically cancelling the flight a few days before
       | departure. I was looking to scrape some data to avoid this kind
       | of unreliable flight.
        
         | einpoklum wrote:
         | Check out my other comment...
         | 
         | The USDT on-time performance data goes back as far as October
         | 1987 (and you can specify the period to the download script
         | with the --first-year, --first-month, --last-year and
         | --last-month command-line switches).
         | 
         | Once the data is loaded you can use spiffy SQL to print out
         | routes the way you like them. Unfortunately the data is also a
         | bit dirty (which is something I'm working on).
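As a rough illustration of the kind of query that could answer the question above, here is a minimal Python sketch that ranks routes by cancellation rate. It uses an in-memory SQLite database instead of MonetDB, and the table name `ontime`, its columns, and the sample rows are all hypothetical, not the schema the download script creates.

```python
import sqlite3

# Hypothetical miniature of an on-time table. The column names are
# assumptions for illustration, not the schema made by the real script.
ROWS = [
    # (year, month, carrier, origin, dest, cancelled)
    (2020, 4, "AA", "ORD", "LGA", 1),
    (2020, 4, "AA", "ORD", "LGA", 1),
    (2020, 4, "AA", "ORD", "LGA", 0),
    (2020, 4, "DL", "ATL", "MCO", 0),
    (2020, 4, "DL", "ATL", "MCO", 0),
    (2020, 5, "DL", "ATL", "MCO", 1),
]


def build_demo_db():
    """Create an in-memory database holding the sample rows."""
    conn = sqlite3.connect(":memory:")
    conn.execute(
        "CREATE TABLE ontime (year INTEGER, month INTEGER, carrier TEXT, "
        "origin TEXT, dest TEXT, cancelled INTEGER)"
    )
    conn.executemany("INSERT INTO ontime VALUES (?, ?, ?, ?, ?, ?)", ROWS)
    return conn


def cancellation_rates(conn, min_flights=1):
    """Return (carrier, origin, dest, flights, rate) rows, worst route first."""
    query = """
        SELECT carrier, origin, dest,
               COUNT(*)       AS flights,
               AVG(cancelled) AS cancel_rate
        FROM ontime
        GROUP BY carrier, origin, dest
        HAVING COUNT(*) >= ?
        ORDER BY cancel_rate DESC, carrier, origin, dest
    """
    return conn.execute(query, (min_flights,)).fetchall()


if __name__ == "__main__":
    for row in cancellation_rates(build_demo_db()):
        print(row)
```

The `min_flights` threshold is there because a route with only a handful of flights can show an extreme rate by chance.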
        
       | JamilD wrote:
       | A few years ago I tried writing a Python wrapper around SABRE's
       | API to get pricing, route, and schedule data, which seemed to
       | work reasonably well. It likely doesn't work anymore, but it was
       | a fun exercise. https://github.com/Jamil/sabre_dev_studio
       | 
       | I wish I had access to the GDS data to get realtime seat/award
       | availability, but I couldn't find any pricing information to get
       | that information through Sabre's API.
       | 
       | Does anyone know how much that costs, or if there are any
       | services which provide it as an API? I use ExpertFlyer for
       | personal use, but ideally I'd want to get that information at the
       | source...
        
         | jsjohnst wrote:
         | I'd really like to know this too, but I've been unable to get
         | a price either. I also really want an API (ideally with
         | historical data available) with fare pricing data too, but I
         | haven't been able to get a quote on that either.
        
           | alas44 wrote:
           | You can both take a look at Amadeus for dev (Amadeus is the
           | biggest GDS) https://developers.amadeus.com
           | 
           | More info on API here
           | https://github.com/amadeus4dev/hackathon-starter/blob/master...
           | 
           | Disclaimer: I work for Amadeus but have never actually used
           | this API service. I'd be interested in your feedback.
        
             | jsjohnst wrote:
             | I'll look into this API, thanks! With everything I've tried
             | previously, I ran into limitations that blocked me from
             | building the project I was working on.
        
       | kingsloi wrote:
       | For anyone interested, the nice guys at https://aviation-edge.com
       | supplied me access to their flight API so I can track how many
       | flights fly directly over my little community in Gary, Indiana:
       | https://millerbeach.community
       | 
       | I wish I was able to track more frequently than every 15 minutes
       | (the maximum on the free API tier), because some aircraft pass
       | overhead before they're picked up. So it's not the most accurate,
       | but it gives a rough figure for traffic to/from O'Hare, Midway,
       | and Gary.
        
         | ant6n wrote:
         | I'm working on a project to compare historical availability of
         | seats between city pairs in Europe. Too bad their historical
         | API doesn't return aircraft type (so the number of seats is
         | unknown). I also couldn't find how far back their data goes.
         | 
         | ... for my project, I actually got some historical paper
         | schedules of the Official Aviation Guide; basically they're
         | phone books. I hope to find a decent/affordable database for
         | more recent data. (MIT students/alumni actually get access to a
         | database going back to 1979, but alas no access for
         | outsiders...)
        
           | notahacker wrote:
           | The Official Aviation Guide became OAG, who will have what
           | you need in digital form but at steep commercial rates, as
           | will their competitor Innovata (Cirium).
           | 
           | The actual seats bit is surprisingly complex if you want
           | accurate figures, as the same aircraft type can have wildly
           | different numbers of seats depending on layout and class
           | configuration. OAG/Innovata's standard schedule product
           | shows the aircraft variant _normally_ assigned to a route,
           | and they survey the airlines on the seating configurations
           | of their aircraft to calculate capacity and ASKs (available
           | seat kilometres). I believe Cirium now cross-reference this
           | with flight tracking data to get data based on the actual
           | aircraft used (which solves edge cases like substitutions or
           | an airline operating differently configured A330-200s on
           | different routes) - doing that was part of the masterplan
           | when I worked for them before they acquired Flightstats.
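To make the capacity point concrete, here is a small Python sketch of an ASK (available seat kilometres) calculation: seats multiplied by route distance. The haversine distance is only an approximation of what a schedule provider would use, and the airport coordinates and the two seat counts are illustrative assumptions, not survey data from OAG or Cirium.

```python
import math

EARTH_RADIUS_KM = 6371.0  # mean Earth radius


def great_circle_km(lat1, lon1, lat2, lon2):
    """Haversine great-circle distance between two points, in kilometres."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlmb = math.radians(lon2 - lon1)
    a = (math.sin(dphi / 2) ** 2
         + math.cos(phi1) * math.cos(phi2) * math.sin(dlmb / 2) ** 2)
    return 2 * EARTH_RADIUS_KM * math.asin(math.sqrt(a))


def available_seat_km(seats, distance_km, flights=1):
    """ASK = seats offered x distance flown, summed over flights."""
    return seats * distance_km * flights


if __name__ == "__main__":
    # Approximate coordinates for London Heathrow and New York JFK.
    distance = great_circle_km(51.47, -0.45, 40.64, -73.78)
    # Two hypothetical layouts of the same aircraft type.
    for label, seats in (("dense layout", 290), ("premium-heavy layout", 250)):
        ask = available_seat_km(seats, distance, flights=30)
        print(f"{label}: {ask:,.0f} ASK over 30 flights")
```

The same aircraft type on the same route yields noticeably different ASK figures depending on the layout, which is why knowing only the aircraft type is not enough.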
        
         | knz wrote:
         | Nice job on your community website!
         | 
         | You may already be aware of this, but if you want real-time
         | ADS-B, check out PiAware
         | (https://flightaware.com/adsb/piaware/) as a low-cost option to
         | run your own ADS-B ground station via a Raspberry Pi.
        
       | einpoklum wrote:
       | The US Bureau of Transportation Statistics provides historic
       | flight schedule and actual flight performance data in CSV tables:
       | 
       | http://www.transtats.bts.gov/DL_SelectFields.asp?Table_ID=23...
       | 
       | But it's cumbersome to work with.
       | 
       | I am working (on and off) on a DBMS benchmark based on this data.
       | As part of that endeavor, I have a script which:
       | 
       | * Automates downloading the CSVs.
       | 
       | * Creates an appropriate SQL database schema.
       | 
       | * Performs a bit of rudimentary cleaning (e.g. invalid character
       | codes; optional)
       | 
       | * Loads the CSV files into the database.
       | 
       | So that, from the command-line, you could get the flight on-time
       | performance data by merely typing in something like:
       | 
       |     /path/to/usdt-ontime-tools/scripts/setup-usdt-ontime-db -r -db-name ontime --first-year 2019 --last-year 2020
       | 
       | It's available within this repository:
       | 
       | https://github.com/eyalroz/usdt-ontime-tools
       | 
       | The caveat is that, for now, the only DBMS supported directly is
       | MonetDB: https://www.monetdb.org/ , a FOSS analytics-oriented
       | columnar DBMS.
       | 
       | An adaptation of the script for other systems (MySQL/Maria,
       | PostgreSQL) should be straightforward, since the commands are
       | SQL'ish after all. If you're interested in that, open an issue or
       | write me.
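For readers who want a feel for the schema, cleaning and loading steps listed above without installing MonetDB, here is a minimal Python sketch using SQLite instead. It is not the repository's script: the column subset, the `clean` rule (dropping non-printable characters) and the sample CSV are assumptions for illustration, and the download step is left out.

```python
import csv
import io
import sqlite3

# Made-up sample in the rough shape of an on-time CSV. The second data row
# carries a stray control character to exercise the cleaning step.
SAMPLE_CSV = (
    "Year,Month,Origin,Dest,Cancelled\n"
    "2019,1,ORD,LGA,0.00\n"
    "2019,1,ORD,\x07LGA,1.00\n"
    "2019,2,ATL,MCO,0.00\n"
)


def create_schema(conn):
    """Create a small table for the assumed column subset."""
    conn.execute(
        "CREATE TABLE ontime (year INTEGER, month INTEGER, "
        "origin TEXT, dest TEXT, cancelled INTEGER)"
    )


def clean(value):
    """Drop non-printable characters, then trim surrounding whitespace."""
    return "".join(ch for ch in value if ch.isprintable()).strip()


def load_csv(conn, csv_file):
    """Insert every CSV row into the table and return the table's row count."""
    reader = csv.DictReader(csv_file)
    rows = (
        (
            int(r["Year"]),
            int(r["Month"]),
            clean(r["Origin"]),
            clean(r["Dest"]),
            int(float(r["Cancelled"])),  # the sample stores 0.00 / 1.00
        )
        for r in reader
    )
    conn.executemany("INSERT INTO ontime VALUES (?, ?, ?, ?, ?)", rows)
    conn.commit()
    return conn.execute("SELECT COUNT(*) FROM ontime").fetchone()[0]


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")
    create_schema(conn)
    print(load_csv(conn, io.StringIO(SAMPLE_CSV)), "rows loaded")
```

For real files, `io.StringIO(SAMPLE_CSV)` would be replaced with `open(path, newline="")`.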
        
         | bernardv wrote:
         | Spent a lot of time with that dataset a couple of years ago,
         | looking at historical flight delays and cancellation rates. If
         | only this type of data were available outside of the US. I
         | believe the data is updated daily; it provides a lot of detail
         | and is of pretty good quality.
        
       | fossforall wrote:
       | Very cool, though the lack of recent data is disappointing.
        
       | rumblestrut wrote:
       | I wish there was open train data.
        
         | thamer wrote:
         | Here's my chance to plug something I wrote long ago (back in
         | 2012), and revived earlier this month. I used the
         | train/bus/riverboat schedules from Transport for London (TfL)
         | to create an animated view over 24 hours of every single
         | vehicle as it journeys through London. The 2021 version is in
         | 4K at 60 fps: https://www.youtube.com/watch?v=0rj60B7w59s -
         | with details posted here:
         | https://log.kv.io/post/2012/06/04/public-transports-in-londo...
         | 
         | "Open train data" is a bit vague without mentioning where these
         | trains might be, but I did find the London Tube schedule[1] in
         | GTFS[2] format, as well as the bus schedule[3] also in GTFS
         | format. Look for your city or country name followed by "open
         | data" and you might find interesting datasets. In the UK the
         | National Public Transport Data Repository (NPTDR) publishes a
         | database of every public transport journey in Great Britain for
         | a selected week in October each year[4] (only goes until 2011
         | though).
         | 
         | [1] Tube, scheduled trips: https://hash.ai/@tfl/tfl-gtfs
         | 
         | [2] GTFS is a CSV-based transit data format:
         | https://developers.google.com/transit/gtfs/reference
         | 
         | [3] Buses, scheduled trips:
         | https://data.bus-data.dft.gov.uk/timetable/download/
         | 
         | [4] NPTDR database:
         | https://data.gov.uk/dataset/d1f9e79f-d9db-44d0-b7b1-41c216fe...
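Since GTFS is just a set of CSV files, a feed can be explored with nothing more than the standard library. The Python sketch below counts scheduled stop events per stop from `stops.txt` and `stop_times.txt`. The file and field names follow the GTFS reference, but the sample rows are made up and do not come from the TfL feed.

```python
import csv
import io
from collections import Counter

# Tiny made-up extracts of two GTFS files.
STOPS_TXT = (
    "stop_id,stop_name,stop_lat,stop_lon\n"
    "S1,Example Street,51.50,-0.12\n"
    "S2,Sample Road,51.51,-0.10\n"
)
STOP_TIMES_TXT = (
    "trip_id,arrival_time,departure_time,stop_id,stop_sequence\n"
    "T1,08:00:00,08:00:30,S1,1\n"
    "T1,08:05:00,08:05:30,S2,2\n"
    "T2,25:10:00,25:10:30,S1,1\n"
)


def gtfs_seconds(hhmmss):
    """Convert a GTFS time to seconds after the start of the service day.

    GTFS allows hours past 24 for trips that run beyond midnight, so a
    plain clock-time parser is not suitable here.
    """
    hours, minutes, seconds = (int(part) for part in hhmmss.split(":"))
    return hours * 3600 + minutes * 60 + seconds


def stop_events_per_stop(stops_file, stop_times_file):
    """Return {stop_name: number of scheduled stop events}."""
    names = {r["stop_id"]: r["stop_name"] for r in csv.DictReader(stops_file)}
    counts = Counter(r["stop_id"] for r in csv.DictReader(stop_times_file))
    return {names[stop_id]: n for stop_id, n in counts.items()}


if __name__ == "__main__":
    print(stop_events_per_stop(io.StringIO(STOPS_TXT),
                               io.StringIO(STOP_TIMES_TXT)))
    print(gtfs_seconds("25:10:00"))
```

An animation like the one described above would additionally need `trips.txt`, `calendar.txt` and, where present, `shapes.txt` to place each vehicle on a given day.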
        
       | radelaine wrote:
       | For those who want to jump in and query this dataset, I uploaded
       | it here: https://bit.io/boyd/airports
       | 
       | I'm still working on bit.io and would love feedback, so hit me up.
        
         | sm4rk0 wrote:
         | Thank you for discovering bit.io!
        
       | amatecha wrote:
       | You can get free enterprise access to a lot of the major flight
       | tracker services by setting up an ADS-B receiver and feeding the
       | data to them. Basically they give you full access to everything
       | as if you were paying for the top tier of their services, because
       | you're helping increase the coverage of their data. A few such
       | services:
       | 
       | https://www.flightradar24.com
       | 
       | https://flightaware.com/
       | 
       | https://www.radarbox.com/
       | 
       | https://skyscanworld.com/
       | 
       | There's also https://www.adsbexchange.com/ which doesn't filter
       | their data (probably much to the chagrin of various businesses
       | and governments). If you see/hear a weird plane above and you
       | can't find it on the commercial services above, check ADS-B
       | Exchange.
        
         | eigen wrote:
         | > You can get free enterprise access to a lot of the major
         | flight tracker services by setting up an ADS-B receiver and
         | feeding the data to them. Basically they give you full access
         | to everything as if you were paying for the top tier of their
         | services, because you're helping increase the coverage of their
         | data.
         | 
         | "top tier" may be overstating it but setting up a RPi and $20
         | USB ADS-B receiver will get you the $90/month Enterprise feed
         | [1]. Still a great deal if this is a topic that interests you.
         | 
         | [1] https://flightaware.com/adsb/
        
         | tpmx wrote:
         | Not the same thing. Scheduled future/historic flight data vs
         | observed realtime/historic flight data.
        
           | amatecha wrote:
           | They do give historic data, but yeah I don't know offhand
           | which ones give comprehensive direct API access just by
           | feeding ADS-B data. RadarBox seems to. FR24 and FA require
           | you to contact them -- I've never done this so I don't know
           | how much friction the process entails, or what kind of API
           | limits you may be subject to. Probably depends on your
           | intended application.
        
       | leugim wrote:
       | The site seems down. Archive.org backup:
       | https://web.archive.org/web/20210427143048/https://openfligh...
        
         | alifaziz wrote:
         | ..and link to github https://github.com/jpatokal/openflights
        
       | asix66 wrote:
       | Try the "crowdsourced" ADS-B Exchange site, which shows
       | unfiltered flight data. [0] For more info, check their FAQ.
       | 
       | Live data: https://globe.adsbexchange.com
       | 
       | [0] https://www.adsbexchange.com
        
         | notahacker wrote:
         | And if the OP has a strong personal interest in tracking
         | flights over his community, he should pay particular attention
         | to the page about antenna and setting up his own tracker
        
       | Fomite wrote:
       | Briefly got interested, then hit "Warning: The third-party that
       | OpenFlights uses for route data ceased providing updates in June
       | 2014. The current data is of historical value only."
        
         | tpmx wrote:
         | http://info.flightmapper.net/ is the gold standard for manual
         | use, as far as I'm concerned. Would be lovely to have
         | programmatic access to this data.
         | 
         | They say that they get their data from Cirium.
        
       | wyozi wrote:
       | Downloadable airport data on OpenFlights tends to be quite dated,
       | notably missing, for example, Berlin Brandenburg Airport.
       | OurAirports (https://ourairports.com/) has a slightly different
       | format but the data there is significantly more recent.
       | 
       | source: started with OpenFlights but had to switch to OurAirports
       | for my project https://flightnotebook.com
        
       | jjwiseman wrote:
       | If you need airport or airspace data, openaip seems to be decent:
       | http://www.openaip.net/
       | 
       | For some reason they make you register in order to download the
       | data, and the site is a bit confusing, but the data seems good.
        
         | jjwiseman wrote:
         | Here's an app I wrote for FS2020 that uses OpenAIP airspace
         | data: https://twitter.com/lemonodor/status/1384611707314606090
        
       | phsource wrote:
       | It's ridiculous that the only original source of this data, the
       | IATA [0], charges $700+ for this list, so kudos to OpenFlights.
       | 
       | I can't stress just how important (and how hard) it is to get a
       | great source of data for airports -- I've now built 3 travel-
       | related projects (the latest, Wanderlog [https://wanderlog.com],
       | keeps people's flight reservations, so uses it for an
       | autocomplete), and it's been a key building block for all of
       | them.
       | 
       | The main datasets we use are:
       | 
       | - OpenFlights [1]: mentioned in this post, but this dataset was
       | great since it had timezone too.
       | 
       | - OurAirports [2]: no timezone here, but the "type" and
       | "scheduled_service" columns in this dataset are essential. "Type"
       | lets you distinguish between small/medium/large airports, and
       | "scheduled_service" lets you easily filter out airports without
       | real flights (which you often might not care about).
       | 
       | - Random other GitHub Gist [3]: I have no idea where this data
       | comes from, but it was surprisingly complete and has a few golden
       | nuggets like "num_flights" and "runway_length" in addition to
       | "timezone". The presence of a "woeid" suggests Yahoo-related
       | origins, but it's hard to be sure.
       | 
       | - We now supplement this with airports from autocomplete APIs
       | like Skyscanner's, because they're still the most up-to-date.
       | 
       | Long story short, it'd be AWESOME to have one complete, updated
       | database with all this data in one place. This kind of data
       | really should be public and a public service, but until then it's
       | unfortunately up to the community.
       | 
       | [0] https://www.iata.org/en/publications/store/airline-coding-di...
       | 
       | [1] https://github.com/jpatokal/openflights/
       | 
       | [2] http://ourairports.com/data/
       | 
       | [3] https://gist.github.com/tdreyno/4278655
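As a sketch of the filtering described above, the Python snippet below keeps only larger airports that have scheduled service, which is roughly what an airport autocomplete would want. Only the "type" and "scheduled_service" columns are named in the comment; the other column names, the category values such as `large_airport`, and every row in the sample are assumptions for illustration and should be checked against the real OurAirports file.

```python
import csv
import io

# Made-up rows in the shape of an OurAirports-style airports.csv.
SAMPLE_CSV = """\
ident,type,name,iata_code,scheduled_service
ZZ01,large_airport,Example International,QQA,yes
ZZ02,small_airport,Example Field,,no
ZZ03,medium_airport,Example Regional,QQB,yes
ZZ04,medium_airport,Example Air Base,QQC,no
ZZ05,heliport,Example Hospital Helipad,,no
"""


def airports_with_real_flights(csv_file,
                               wanted_types=("large_airport", "medium_airport")):
    """Return rows of the wanted types that report scheduled service."""
    return [
        row
        for row in csv.DictReader(csv_file)
        if row["type"] in wanted_types and row["scheduled_service"] == "yes"
    ]


if __name__ == "__main__":
    for airport in airports_with_real_flights(io.StringIO(SAMPLE_CSV)):
        print(airport["iata_code"], airport["name"])
```

Timezone would still have to be joined in from another source, since the comment notes that OurAirports does not carry it.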
        
         | chillydawg wrote:
         | openstreetmap might happen to have a list of airports with
         | decent metadata, but it'll have zero info about actual flights.
        
           | angott wrote:
           | Yep. It is quite good for some data that volunteers can
           | provide from open data. Things like runway numbers, surface
           | material and length.
        
         | ctippett wrote:
         | FlightStats / Cirium have an API for airport data[0] that I've
         | found to be mostly complete (sans a few rural Australian
         | airports). It includes historical records for airports that are
         | no longer active, such as Hong Kong's Kai Tak airport that
         | previously went by the HKG IATA code.
         | 
         | FlightAware have a similar API[1].
         | 
         | These aren't free or open mind you, but are at least readily
         | accessible for those that need/want it.
         | 
         | [0] https://developer.flightstats.com/api-docs/airports/v1
         | 
         | [1] https://uk.flightaware.com/commercial/aeroapi/
        
         | squeaky-clean wrote:
         | I've done something similar for my current job. I've used all
         | these same data sources, even got access to the IATA stuff
         | eventually. I also used GeoNames a lot, it's not specific to
         | airports but it has decent airport data and I need a lot of the
         | surrounding features as well.
         | 
         | Every source was definitely useful, but I think ultimately
         | crawling Wikipedia was the most useful and highest quality set
         | of data for me (after some significant data cleaning). The List
         | of Airports By IATA Code [0] is almost as comprehensive as the
         | official list from IATA, and you can follow the links to crawl
         | info about the airport and city served. Getting info about what
         | city the airport is considered to "serve" is so useful, as most
         | airports are technically not in the city people consider them
         | to be the major airports of, and some "serve" multiple cities.
         | 
         | Of course the difficult part there is that Wikipedia data isn't
         | really clean or standardized. The page HTML isn't standard,
         | even things that look very standardized like the sidebar will
         | have 30 variations when you crawl all the airport pages. There
         | is WikiData, but I found it still wasn't simple to get the data
         | from there, and it also didn't include most of the page content
         | which I wanted. [1]
         | 
         | Nowadays we have direct relationships with the airlines/GDS/so
         | on, and also a department of people to add and manage the data
         | ourselves, because even the direct source gives you pretty poor
         | quality data. The project was way more fun when I was wrangling
         | data from a dozen places around the web :) Now it's more of an
         | enterprise CRUD webapp with some fancy localization and GIS
         | tooling.
         | 
         | [0]
         | https://en.wikipedia.org/wiki/List_of_airports_by_IATA_airpo...
         | 
         | [1] This was a while ago, so maybe WikiData has changed
        
         | sokoloff wrote:
         | If it's important and hard to get this data, is it really
         | ridiculous that a provider of the data charges $700 for it?
        
           | squeaky-clean wrote:
           | It's only hard to get because IATA doesn't easily provide it.
           | IATA isn't "a" provider of the data, they are the data. It
           | would be like if you had to purchase a list of the bus stops
           | and schedule in your city from your transportation
           | department.
        
             | sokoloff wrote:
             | Many (most?) standards bodies charge for their standards
             | documents and data feeds. There are obviously costs
             | associated with running IATA; I don't see why they should
             | be obliged to provide their data for free, especially when
             | the typical user of such data is likely to build a for-
             | profit business on top of it.
             | 
             | I don't think IATA actually assigns the codes, but rather
             | aggregates them. In the US, the FAA assigns the airport
             | identifiers:
             | https://www.faa.gov/nextgen/cip/airport_facility/
        
               | realityking wrote:
               | I think the FAA will only assign the four-letter ICAO
               | code. The three-letter IATA code is assigned by IATA
               | directly:
               | https://www.iata.org/contentassets/1277d04d575843dc80a3f613d...
        
               | sokoloff wrote:
               | If you tell me the FAA code for a US airport, I can tell
               | you the IATA code without a database lookup and without
               | checking with IATA.
        
             | vkou wrote:
             | I have to pay the bus company to ride the bus, it doesn't
             | seem insane that the bus company may want to charge me
             | money if I asked them for a full, comprehensive, organized
             | list of stops and schedules.
             | 
             | Sure, there are reasons why they would want to make it
             | available for free, but there are also reasons why they
             | would want to charge me, and they aren't unreasonable. I
             | don't have any fundamental, natural right to a
             | transportation network curating and providing their data
             | for my consumption. It might not care enough about my needs
             | to spend money to do so. It might not care enough about my
             | needs to spend money to do so for free.
        
         | nonameiguess wrote:
         | Are you just looking for airports, routes, and schedules?
         | FlightAware provides that: https://flightaware.com/
         | 
         | Not sure what you get with the commercial services, but even
         | the free services are pretty good. It's what we used in 1st CAV
         | to track the redeployment of the last units to leave Iraq in
         | 2011.
        
       | ALittleLight wrote:
       | Why don't airlines provide a good free API for flights and
       | reservations? I would think they would want developers to help
       | make accessing their offerings and buying them easier.
        
         | J5892 wrote:
         | Skiplagged.com (and the legal issues around it) is a good
         | example of why they may not want that data easily accessible.
         | 
         | But other than that, I assume there's a lot of money in
         | partnerships with sites like Kayak and Priceline. But I'm not
         | even sure which direction that money flows.
        
       ___________________________________________________________________
       (page generated 2021-04-27 23:00 UTC)