Data extraction from texts and images is a fundamental human activity. Any time we read a book or newspaper, we’re extracting data, whether we know it or not.
Beyond everyday reading, data extraction is a key part of gathering information for almost any endeavor and spans all fields of work. Investors scan news items for companies of interest and stock prices, scientists read publications to extract data relevant to their own studies, and auto mechanics look up torque specifications for tightening bolts. These are targeted data extractions: the person is looking for specific information in the content, and the data elements to be extracted can be defined beforehand. Automating this type of targeted data extraction would save a tremendous amount of human resources for organizations that depend on extracting data from published material, particularly considering the ever-increasing amount of such material available. The Seeker is interested in gathering and comparing the performance of different algorithms and methods for automated data extraction and will provide specific datasets and data elements to extract for this Challenge.
I am looking for an ideal solution, such as a tool that can extract all meaningful and relevant information and data from texts, graphs, and images.
If somebody has a solution, that would be perfect.