webscraping

Extract tables from PDF files with tabulizer

Far too often I find myself needing to fetch lists of genes, expression data, or similar from journal articles, only to realize that the data is buried deep within the supplementary material in the form of a giant PDF. (The horror!) Here is a how-to for scraping data from a linked PDF file (by URL) using the tabulizer R package.
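As a rough sketch of the idea, the snippet below wraps tabulizer's `extract_tables()`, which accepts a file path or URL and by default returns a list of character matrices. The function names `matrix_to_df` and `scrape_pdf_tables`, and the example URL in the usage comment, are made up for illustration. The sketch also assumes that the first row of each extracted table holds the column headers, which will not be true for every PDF.

```r
# Hypothetical helper: turn a character matrix into a data frame,
# assuming (!) that the first row contains the column headers.
matrix_to_df <- function(m) {
  df <- as.data.frame(m[-1, , drop = FALSE], stringsAsFactors = FALSE)
  names(df) <- m[1, ]
  df
}

# Hypothetical wrapper: extract all tables from a PDF given by path or URL.
# Requires the tabulizer package (which in turn needs a working Java setup).
scrape_pdf_tables <- function(pdf_url, pages = NULL) {
  if (!requireNamespace("tabulizer", quietly = TRUE)) {
    stop("Please install the 'tabulizer' package first.")
  }
  # extract_tables() returns a list with one character matrix per table found
  tables <- tabulizer::extract_tables(pdf_url, pages = pages)
  lapply(tables, matrix_to_df)
}

# Usage (placeholder URL, replace with the link to your supplementary PDF):
# tables <- scrape_pdf_tables("https://example.org/supplementary.pdf", pages = 1)
# head(tables[[1]])
```

Extracted values come back as character strings, so numeric columns such as expression values usually need an explicit `as.numeric()` afterwards.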