Joe Wilson


How to compare the tags of HTML files in C++?

Nov 30 2014 9:16 AM
 
I want to compare several HTML files, each of which contains many tags (stored as strings), so I am looking for the approach with the lowest time complexity.
For example:
Below are two HTML files that each contain many string tags. I want to compare them and then score their similarity using these formulas:
 
1. Average of similar tags of each two HTML files = ((Sum of the quantities of the shared tags in the first file) + (Sum of the quantities of the shared tags in the second file)) / ((Sum of the second column of the first file) + (Sum of the second column of the second file)).
 
2. Main function for calculating: F(File1, File2) = ((Number of tags that appear in both files) / ((Number of tags in the first file) + (Number of tags in the second file) - (Number of tags that appear in both files))) * (Average of similar tags of each two HTML files)
 
counter: the number of tags that appear in both HTML files.
Note: The second column contains the quantity of each tag in the current HTML file.
 
The first HTML file:
 
joiuh    12
@62jj    10
k6235    2
99ui*    3
00Qyu67  9
*8455    7
Sum of the second column = 43
 
The second HTML file:
 
00Qyu67  20
NY%%%20  1
UWCN10   13
89PO*    6
$$CS40   11
@62jj    56
k6235    10
Sum of the second column = 117
 
In this example:
 
Average of similar tags of each two HTML files: ((10 + 9 + 2) + (56 + 20 + 10)) / (43 + 117) = 107 / 160 ≈ 0.669
 
F(File1, File2) = (3 / (13 - 3)) * 0.669 ≈ 0.201
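 
To make the calculation concrete, here is a minimal C++ sketch of the two formulas. It assumes each file has already been reduced to a tag-to-quantity table. The names TagCounts, similarity and exampleScore are made up for this illustration, and returning 0 for empty input is my own assumption, since the formulas do not cover that case. The row 00Qyu67 of the first file is read with quantity 9, as in the worked example. Using a hash map, each lookup is O(1) on average, so comparing two files is roughly linear in the number of tags.

```cpp
#include <cstddef>
#include <string>
#include <unordered_map>

// Tag name -> quantity of that tag in one file (the "second column").
using TagCounts = std::unordered_map<std::string, long long>;

// F(File1, File2) as defined by the two formulas above.
// Assumption: returns 0.0 when a denominator would be zero,
// because the formulas do not define that case.
double similarity(const TagCounts& first, const TagCounts& second) {
    long long sumFirst = 0;      // sum of the second column, first file
    long long sumSecond = 0;     // sum of the second column, second file
    long long sharedCounts = 0;  // quantities of shared tags, from both files
    std::size_t sharedTags = 0;  // the "counter": tags present in both files

    for (const auto& entry : second) {
        sumSecond += entry.second;
    }
    for (const auto& entry : first) {
        sumFirst += entry.second;
        auto match = second.find(entry.first);  // average O(1) hash lookup
        if (match != second.end()) {
            ++sharedTags;
            sharedCounts += entry.second + match->second;
        }
    }

    const long long totalCounts = sumFirst + sumSecond;
    // All tags of both files minus the shared ones (13 - 3 in the example).
    const std::size_t unionTags = first.size() + second.size() - sharedTags;
    if (totalCounts == 0 || unionTags == 0) {
        return 0.0;
    }

    const double average = static_cast<double>(sharedCounts) / totalCounts;
    return (static_cast<double>(sharedTags) / unionTags) * average;
}

// Builds the two example files from this post and returns their score.
double exampleScore() {
    const TagCounts first = {
        {"joiuh", 12}, {"@62jj", 10}, {"k6235", 2},
        {"99ui*", 3},  {"00Qyu67", 9}, {"*8455", 7}};
    const TagCounts second = {
        {"00Qyu67", 20}, {"NY%%%20", 1}, {"UWCN10", 13}, {"89PO*", 6},
        {"$$CS40", 11},  {"@62jj", 56},  {"k6235", 10}};
    return similarity(first, second);
}
```

Without rounding the intermediate average, exampleScore() gives (3 / 10) * (107 / 160) = 0.200625, which agrees with the figures in the example up to rounding. Extracting the tags from real HTML is a separate step and is not covered by this sketch.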

Answers (81)