Usenet.com

Comp Thread Archive from Usenet.com

Re: Finding an HTML element



"Gilad Novik" <[EMAIL PROTECTED]> wrote in message news:<[EMAIL PROTECTED]>...
> Hi,
> 
> I want to build a tool for mining data from an HTML page. The user selects
> an element on a web page, and my application is trained to recognize that
> element in later versions of the page. For example, suppose the user wants
> to extract some data from a financial page: his total balance, plus the
> table of recent transactions. He highlights those elements in the HTML
> page. The application then analyzes the structure of each highlighted
> element and learns how to find it in similar pages (even when they are not
> identical). What I really need is an algorithm that "understands" a single
> element (by its structure, its position in the page, or any other method),
> so that I can look in a new page and choose the most similar element
> (which should probably be the right one).
> 
> Does anyone have an idea for this?
> 
> Regards,
>     Gilad Novik
> 

Try:

Wrapper Induction for Information Extraction
Nicholas Kushmerick

http://citeseer.nj.nec.com/kushmerick97wrapper.html
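
As a starting point for the "choose the most similar element" idea in the
question, here is a minimal sketch in Python. It is a simple path-and-attribute
similarity heuristic, not the wrapper induction method from the paper above.
Everything in it is an assumption made for illustration: the names
ElementCollector, parse, similarity and find_most_similar, the 0.6/0.4
weights, the sample pages, and the way the user's selection is simulated by
looking up a class attribute.

```python
from difflib import SequenceMatcher
from html.parser import HTMLParser

# Elements that never have a closing tag, so they must not stay on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class ElementCollector(HTMLParser):
    """Records every element with an XPath-like path, its attributes and its text."""

    def __init__(self):
        super().__init__()
        self.root = {"path": [], "counts": {}}  # virtual parent of top-level tags
        self.stack = []       # currently open elements
        self.elements = []    # all elements, in document order

    def handle_starttag(self, tag, attrs):
        parent = self.stack[-1] if self.stack else self.root
        # Index among same-tag siblings, giving steps such as "td[2]".
        parent["counts"][tag] = parent["counts"].get(tag, 0) + 1
        step = f"{tag}[{parent['counts'][tag]}]"
        element = {"tag": tag, "path": parent["path"] + [step],
                   "attrs": dict(attrs), "text": "", "counts": {}}
        self.elements.append(element)
        if tag not in VOID_TAGS:
            self.stack.append(element)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag; tolerate sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i]["tag"] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # Attach text to the innermost open element only.
        if self.stack:
            self.stack[-1]["text"] += data.strip()


def parse(html):
    """Return a list of element records for an HTML string."""
    collector = ElementCollector()
    collector.feed(html)
    collector.close()
    return collector.elements


def similarity(a, b):
    """Score two element records in [0, 1] by path and attribute agreement."""
    path_score = SequenceMatcher(None, a["path"], b["path"]).ratio()
    keys = set(a["attrs"]) | set(b["attrs"])
    if keys:
        same = sum(1 for k in keys
                   if k in a["attrs"] and k in b["attrs"]
                   and a["attrs"][k] == b["attrs"][k])
        attr_score = same / len(keys)
    else:
        attr_score = 1.0
    return 0.6 * path_score + 0.4 * attr_score  # weights are arbitrary


def find_most_similar(trained, html):
    """Return the element of a new page that best matches the trained one."""
    return max(parse(html), key=lambda e: similarity(trained, e), default=None)


TRAINING_PAGE = (
    '<html><body><div id="header">My Bank</div><table><tr>'
    '<td class="label">Total balance</td><td class="balance">1,200.00</td>'
    '</tr></table></body></html>'
)
# Same layout, but with an extra block inserted and a new balance.
NEW_PAGE = (
    '<html><body><div id="header">My Bank</div><div class="ad">Offer</div>'
    '<table><tr><td class="label">Total balance</td>'
    '<td class="balance">1,350.75</td></tr></table></body></html>'
)

# Simulate the user's highlight by picking the element with class "balance".
trained = next(e for e in parse(TRAINING_PAGE)
               if e["attrs"].get("class") == "balance")
best = find_most_similar(trained, NEW_PAGE)
print(best["path"], best["text"])
```

The sibling index in each path step is what separates the balance cell from
the label cell when the attributes are missing or changed. A real tool would
likely need more signals (neighbouring text, several training examples), which
is where the wrapper induction literature comes in.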

[ comp.ai is moderated.  To submit, just post and be patient, or if ]
[ that fails mail your article to <[EMAIL PROTECTED]>, and ]
[ ask your news administrator to fix the problems with your system. ]


