
"Gilad Novik" <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>... > Hi, > > I want to build a tool for data mining from an html page. I want the user to > select an element from a web page, and train my application to recognize it > in its later updates. For example, suppose the user wants to extract some > data from a financial. He want to extract his total balance, plus the table > of the last transactions. What he should do is to highlight the elements > inside the html page. After doing that, the application should analyze the > html element structure, and learns how to find it in similar pages (even > when they are not identical). What I really need is an algorithm to > "understand" a single element (by it's structure, position in page or any > other methods), and then I want to look in a new page, and choose the most > similar element (which should probably be the right one). > > Does anyone has an idea for it? > > Regards, > Gilad Novik > try: Wrapper Induction for Information Extraction Nicholas Kushmerick http://citeseer.nj.nec.com/kushmerick97wrapper.html [ comp.ai is moderated. To submit, just post and be patient, or if ] [ that fails mail your article to <[EMAIL PROTECTED]>, and ] [ ask your news administrator to fix the problems with your system. ]