Reprinted from TidBITS by permission; reuse governed by Creative Commons license BY-NC-ND 3.0. TidBITS has offered years of thoughtful commentary on Apple and Internet topics. For free email subscriptions and access to the entire TidBITS archive, visit http://www.tidbits.com/

Copytables Simplifies Extracting Tabular Data from Web Pages

Adam Engst

On the Web, tables are everywhere. You may not realize how many sites rely on tables behind the scenes for their formatting. Useful as they are for aligning content and displaying columnar data, tables can cause significant frustration if you need to extract data from them. I find myself wanting to do this quite often now that I have the open-source [1]Copytables.

I use the [2]Copytables Chrome extension in Arc, Brave, and Google Chrome, but you can also download [3]Copytables for Firefox (forked from the Chrome version) and [4]Copytables for Safari. I haven't tested those.

Here's an example of what Copytables makes possible. Earlier this week, I wanted to email someone a list of opportunities from the volunteer-management tool Helper Helper. To extract the text from the table in the Web app's interface without Copytables, I would have to select the entire table (below left), copy it, and paste it into BBEdit (below right).

The results in BBEdit aren't terrible compared to some tables I've seen, but I'd still need to delete every other line. That would be doable with such a small table, but what if there were hundreds of lines or the data didn't break cleanly at line breaks?

With Copytables in Arc, I instead pressed the Option key to enter cell-selecting mode and dragged over the cells in the leftmost column to select just them (below left). When I copied and pasted into BBEdit, I got exactly what I wanted (below right).

Another example. I've been in a pitched battle with spambots for the last few weeks, and one of my most successful interim defenses has been blocking IP ranges.
Using Copytables, I can extract hundreds of these quickly from the logs of our WordPress security plug-ins. To select the contents of a column, I press Command-Option and click the column header. Then it's trivial to copy the data into BBEdit for manipulation.

Copytables also enables you to select rows or entire tables. Those features aren't as commonly used (they don't even get default keyboard shortcuts), so whenever I need them, I open the Copytables window (from the Extensions menu in Arc, or by clicking a pinned extension toolbar icon in Brave or Google Chrome), click Rows or Tables to enable the associated capture mode, and then click to select. If you do use the Capture buttons, make sure to disable them when you're done, or certain Web apps won't work correctly due to Copytables capturing their clicks. To select entire tables more quickly, click the Previous Table or Next Table buttons in the Find row.

The Copy options at the top require more explanation. I haven't needed them, but they offer various formats for the copied data, some of which could be handy (I'm particularly taken with the options that swap columns and rows). Theoretically, you can set one of these options as the default to use with Edit > Copy, but that didn't work in my quick testing. Stick to the buttons in the Copytables interface.

* As is: Copy the table as seen on the screen
* Plain Table: Copy the table without formatting
* Text: Copy as tab-delimited text
* Text+Swap: Copy as tab-delimited text, swap columns and rows
* CSV: Copy as comma-separated text
* CSV+Swap: Copy as comma-separated text, swap columns and rows
* HTML+CSS: Copy as HTML source with formatting
* HTML: Copy as HTML source without formatting
* Textile: Copy as [5]Textile (text content)
* Textile+HTML: Copy as Textile (HTML content)

Copytables has one other clever feature I find handy occasionally: the infobox. It's an inset box that shows information about your current selection.
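
To make the infobox idea concrete, here is a minimal Python sketch of the kind of summary it reports for a selection: a count of the cells plus their sum, average, minimum, and maximum. This is not Copytables's own code. The function name `summarize_cells` and the sample values are invented for illustration, and the way numbers are parsed (dropping thousands separators, skipping non-numeric cells) is an assumption about reasonable behavior, not a description of what the extension does.

```python
from statistics import mean


def summarize_cells(cells):
    """Summarize the numeric cells of a selection, infobox-style.

    Hypothetical helper: `cells` is a list of cell texts as copied
    from a table column.
    """
    numbers = []
    for text in cells:
        # Assumption: remove thousands separators and surrounding
        # whitespace, then skip anything that is not a number.
        cleaned = text.replace(",", "").strip()
        try:
            numbers.append(float(cleaned))
        except ValueError:
            continue

    if not numbers:
        return {"count": 0}

    return {
        "count": len(numbers),
        "sum": sum(numbers),
        "average": mean(numbers),
        "min": min(numbers),
        "max": max(numbers),
    }


# Made-up sample values, not data from the article.
selection = ["1,200", "850", "n/a", "2,450"]
summary = summarize_cells(selection)
print(summary)
```

For a quick check like this, a handful of summary numbers is often all you need, which is why having them appear alongside the selection can save a trip to a spreadsheet.
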
Consider this table of data about Canadian wildfires from 2000–2021. When I select the contents of the Area Burned column, Copytables displays the blue infobox at the top that counts the number of selected cells, calculates the sum and average, and calls out the min and max. These simple calculations can preclude the need to move data to a spreadsheet.

If the infobox gets in your way, you can turn it off or have Copytables display it in a different corner of the window. To access this and other settings, click the Options link in the Copytables window.

The most useful settings are the modifier keys for click-and-drag selection (below). You'll want to adjust these if they conflict with something else on your system. The Copytables window also has a Keyboard Shortcuts link that provides a browser-wide approach to setting keyboard shortcuts for extensions; the Copytables options match the Find and Capture buttons in its window.

Copytables is free, but if you find it useful, you can [6]join me in donating to the author, Georg Barikin.

References

Visible links
1. https://merribithouse.net/copytables/
2. https://chrome.google.com/webstore/detail/copytables/ekdpkppgmlalfkphpibadldikjimijon
3. https://addons.mozilla.org/en-US/firefox/addon/copywebtables/
4. https://merribithouse.net/copytables/safari/
5. https://en.wikipedia.org/wiki/Textile_%28markup_language%29
6. https://merribithouse.net/copytables/

Hidden links:
7. https://tidbits.com/wp/../uploads/2023/08/Without-Copytables.png
8. https://tidbits.com/wp/../uploads/2023/08/With-Copytables.png
9. https://tidbits.com/wp/../uploads/2023/08/Copytables-columns.png
10. https://tidbits.com/wp/../uploads/2023/08/Copytable-interface.png
11. https://tidbits.com/wp/../uploads/2023/08/Copytables-infobox.png
12. https://tidbits.com/wp/../uploads/2023/08/Copytables-modifier-keys.png