TITLE: R package to access the World Flora Online GraphQL API
DATE: 2025-11-03
AUTHOR: John L. Godlee
====================================================================

I have been working on integrating the WorldFlora R package into SEOSAW so we can use the World Flora Online taxonomic backbone to match taxonomic names in our tree inventory data. In doing this work I have learned that: a) the WorldFlora R package has some drawbacks, namely that it requires a large download of a static version of the WFO database, which then also needs to be read into memory in R, and b) the WFO has a GraphQL API to access their database remotely without downloading the entire thing.

  [have been working]: /posts/work/2025-07-09-taxon_check.html
  [WorldFlora R package]: https://cran.r-universe.dev/WorldFlora/doc/manual.html
  [SEOSAW]: https://seosaw.github.io
  [World Flora Online]: https://www.worldfloraonline.org/
  [GraphQL API]: https://wfo-about.rbge.info/gql_index.php

I have since developed a prototype R package which uses the API to return information from the WFO database, and contains functions for matching taxonomic names.

The package can be found here: https://github.com/johngodlee/wfoAPI

The top-level function is matchNames(), which takes a vector of taxonomic names and queries the WFO database for each name. Notable features of the function are:

- It leverages the fuzzy matching algorithm implemented server-side by WFO, and exposes parameters to control the behaviour of the algorithm.
- When multiple possible names are matched, the user can optionally enter an interactive selection mode to pick a specific name.
- When a matched name is a synonym, the function returns the accepted name as well.
- Previous API calls are cached locally, to reduce the number of API calls, and to prevent entering interactive mode multiple times for the same name.
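The caching idea in the last point can be illustrated with a minimal sketch in base R. This is not the package's implementation: cached_lookup(), fake_api() and the state environment are hypothetical names invented for this example, and fake_api() simply stands in for a remote call.

```r
# Hypothetical in-memory cache keyed by the submitted name.
# Nothing here is taken from wfoAPI; it only illustrates the idea.
cache <- new.env(parent = emptyenv())
state <- new.env(parent = emptyenv())
state$api_calls <- 0L

# Stand-in for a remote lookup; a real version would query the WFO API
fake_api <- function(name) {
  state$api_calls <- state$api_calls + 1L
  list(submitted = name, matched = toupper(name))
}

# Return the cached result if present, otherwise call the API and store it
cached_lookup <- function(name, use_cache = TRUE) {
  if (use_cache && exists(name, envir = cache, inherits = FALSE)) {
    return(get(name, envir = cache, inherits = FALSE))
  }
  result <- fake_api(name)
  assign(name, result, envir = cache)
  result
}

names_in <- c("Burkea africana", "Brachystegia", "Burkea africana")
results <- lapply(names_in, cached_lookup)

# Three names were submitted, but the repeated name was served from the cache
state$api_calls
```

Because the repeated name never reaches fake_api(), any interactive choice made for it the first time would not need to be repeated either.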
callAPI() is the base-level function which constructs the API query and uses the httr2 package to send the query and unpack the response.

matchName() calls callAPI() and is responsible for handling a single name query. matchName() optionally uses cached data from previous API calls if it exists. If multiple candidate names are matched, the user can optionally enter an interactive selection mode, which is handled by pickName().

matchNames() calls matchName() for each name in the vector of taxonomic names, performs various quality-of-life checks like exiting gracefully if the WFO API is not reachable, and formats the results as a pretty dataframe with one row for each name in the original vector.

These are the arguments available for matchNames():

- x - vector of taxonomic names
- fallbackToGenus - logical, if TRUE genus-level matches will be returned if no species-level match is available
- checkRank - logical, if TRUE consider matches to be ambiguous if it is possible to estimate taxonomic rank from the search string and the rank does not match that in the name record
- checkHomonyms - logical, if TRUE consider matches to be ambiguous if there are other names with the same words but different author strings
- fuzzyNameParts - integer value of 0 (default) or greater. The maximum Levenshtein distance used for fuzzy matching words in x
- interactive - logical, if TRUE (default) user will be prompted to pick names from a list where multiple ambiguous matches are found, otherwise names with multiple ambiguous matches will be skipped
- useCache - logical, if TRUE use cached values in options("wfo.api_uri") preferentially, to reduce the number of API calls
- useAPI - logical, if TRUE (default) allow API calls
- raw - logical, if TRUE a raw nested list is returned, otherwise a dataframe

fallbackToGenus, checkRank, checkHomonyms and fuzzyNameParts are all variables passed directly to the WFO GraphQL API.
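To give a feel for what fuzzyNameParts controls, here is a small local sketch of per-word Levenshtein matching using adist() from base R. The real matching happens server-side in WFO and may differ in detail; words_within() is a hypothetical helper written only for this illustration and is not part of the package.

```r
# Hypothetical helper: TRUE if every word of `query` is within `max_dist`
# edits (Levenshtein distance) of the corresponding word of `candidate`.
words_within <- function(query, candidate, max_dist = 0L) {
  q <- strsplit(query, " +")[[1]]
  cand <- strsplit(candidate, " +")[[1]]
  # Names with different numbers of words are not compared in this sketch
  if (length(q) != length(cand)) {
    return(FALSE)
  }
  # utils::adist() returns a matrix; drop() reduces the 1x1 result to a number
  d <- mapply(function(a, b) drop(utils::adist(a, b)), q, cand)
  all(d <= max_dist)
}

# "africanna" is one edit away from "africana"
exact <- words_within("Burkea africanna", "Burkea africana", max_dist = 0L)
fuzzy <- words_within("Burkea africanna", "Burkea africana", max_dist = 1L)
```

With a distance of 0 the misspelt name is rejected, while a distance of 1 accepts it, which is the trade-off that a larger fuzzyNameParts value makes: more misspellings are caught, at the cost of more ambiguous candidates.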
Here is a basic example:

    x <- c(
      "Burkea africana",
      "Julbernardia paniculata",
      "Fabaceae",
      "Indet indet",
      "Brachystegia",
      "Philenoptera sp.")
    matchNames(x)

The console output:

    1 of 6: Brachystegia
    2 of 6: Burkea africana
    3 of 6: Fabaceae
    4 of 6: Indet indet
    No candidates, skipping: Indet indet
    5 of 6: Julbernardia paniculata
    6 of 6: Philenoptera sp.

    --- Pick a name ---
    Matching string: Philenoptera sp.
    1 wfo-4000029211 Philenoptera Hochst. ex A.Rich. accepted Code/Plantae/Pteridobiotina/Angiosperms/Fabales/Fabaceae/Papilionoideae/Philenoptera
    [ins] Enter a number to pick a row from the list, a valid WFO ID, 'N' for the next page, 'P' for the previous page, 'S' to skip this name:

The dataframe returned, formatted as a nested list for readability:

- Row 1
    - taxon_name_subm: Burkea africana
    - method: AUTO
    - fallbackToGenus: FALSE
    - checkRank: FALSE
    - checkHomonyms: FALSE
    - fuzzyNameParts: 0
    - taxon_wfo_syn: wfo-0000214110
    - taxon_name_syn: Burkea africana
    - taxon_auth_syn: Hook.
    - taxon_stat_syn: valid
    - taxon_role_syn: accepted
    - taxon_rank_syn: species
    - taxon_path_syn: Code/Plantae/Pteridobiotina/Angiosperms/Fabales/Fabaceae/Caesalpinioideae/Burkea/africana
    - taxon_wfo_acc: wfo-0000214110
    - taxon_name_acc: Burkea africana
    - taxon_auth_acc: Hook.
    - taxon_stat_acc: valid
    - taxon_role_acc: accepted
    - taxon_rank_acc: species
    - taxon_path_acc: Code/Plantae/Pteridobiotina/Angiosperms/Fabales/Fabaceae/Caesalpinioideae/Burkea/africana
- Row 2
    - taxon_name_subm: Julbernardia paniculata
    - method: AUTO
    - fallbackToGenus: FALSE
    - checkRank: FALSE
    - checkHomonyms: FALSE
    - fuzzyNameParts: 0
    - taxon_wfo_syn: wfo-0000169220
    - taxon_name_syn: Julbernardia paniculata
    - taxon_auth_syn: (Benth.) Troupin
    - taxon_stat_syn: valid
    - taxon_role_syn: accepted
    - taxon_rank_syn: species
    - taxon_path_syn: Code/Plantae/Pteridobiotina/Angiosperms/Fabales/Fabaceae/Detarioideae/Julbernardia/paniculata
    - taxon_wfo_acc: wfo-0000169220
    - taxon_name_acc: Julbernardia paniculata
    - taxon_auth_acc: (Benth.) Troupin
    - taxon_stat_acc: valid
    - taxon_role_acc: accepted
    - taxon_rank_acc: species
    - taxon_path_acc: Code/Plantae/Pteridobiotina/Angiosperms/Fabales/Fabaceae/Detarioideae/Julbernardia/paniculata
- Row 3
    - taxon_name_subm: Fabaceae
    - method: AUTO
    - fallbackToGenus: FALSE
    - checkRank: FALSE
    - checkHomonyms: FALSE
    - fuzzyNameParts: 0
    - taxon_wfo_syn: wfo-7000000323
    - taxon_name_syn: Fabaceae
    - taxon_auth_syn: Lindl.
    - taxon_stat_syn: conserved
    - taxon_role_syn: accepted
    - taxon_rank_syn: family
    - taxon_path_syn: Code/Plantae/Pteridobiotina/Angiosperms/Fabales/Fabaceae
    - taxon_wfo_acc: wfo-7000000323
    - taxon_name_acc: Fabaceae
    - taxon_auth_acc: Lindl.
    - taxon_stat_acc: conserved
    - taxon_role_acc: accepted
    - taxon_rank_acc: family
    - taxon_path_acc: Code/Plantae/Pteridobiotina/Angiosperms/Fabales/Fabaceae