java - Selecting innermost child of an element with Jsoup


I am attempting to scrape the following HTML:

    <table>
        <tr>
            <td class="cellright" style="cursor:pointer;">
                <table cellpadding="0" cellspacing="0" width="100%">
                    <tr>
                        <td class="cellright" style="border:0;color:#0066cc;"
                        title="view summary" width="70%">92%</td>

                        <td class="cellright" style="border:0;" width="30%">
                        </td>
                    </tr>
                </table>
            </td>
        </tr>

        <tr class="listroweven">
            <td class="cellleft" nowrap><span class="categorytab" onclick=
            "showassignmentsbympandcourse('08/03/2015','58100:6');" title=
            "display assignments art 5 ms. martinho"><span style=
            "text-decoration: underline">58100/6 - art 5 ms.
            martinho</span></span></td>

            <td class="cellleft" nowrap>
                martinho, suzette<br>
                <b>email:</b> <a href="mailto:smartinho@mtsd.us" style=
                "text-decoration:none"><img alt="" border="0" src=
                "/genesis/images/labelicon.png" title=
                "send e-mail teacher"></a>
            </td>

            <td class="cellright" onclick=
            "window.location.href = '/genesis/parents?tab1=studentdata&tab2=gradebook&tab3=coursesummary&studentid=100916&action=form&coursecode=58100&coursesection=6&mp=mp4';"
            style="cursor:pointer;">
                <table cellpadding="0" cellspacing="0" width="100%">
                    <tr>
                        <td class="cellcenter"><span style=
                        "font-style:italic;color:brown;font-size: 8pt;">no
                        grades</span></td>
                    </tr>
                </table>
            </td>
        </tr>

        <tr class="listrowodd">
            <td class="cellleft" nowrap><span class="categorytab" onclick=
            "showassignmentsbympandcourse('08/03/2015','58200:10');" title=
            "display assignments family , consumer sciences 5 sheerin">
            <span style="text-decoration: underline">58200/10 - family ,
            consumer sciences 5 sheerin</span></span></td>

            <td class="cellleft" nowrap>
                sheerin, susan<br>
                <b>email:</b> <a href="mailto:ssheerin@mtsd.us" style=
                "text-decoration:none"><img alt="" border="0" src=
                "/genesis/images/labelicon.png" title=
                "send e-mail teacher"></a>
            </td>

            <td class="cellright" style="cursor:pointer;">
                <table cellpadding="0" cellspacing="0" width="100%">
                    <tr>
                        <td class="cellcenter"><span style=
                        "font-style:italic;color:brown;font-size: 8pt;">no
                        grades</span></td>
                    </tr>
                </table>
            </td>
        </tr>
    </table>

I am trying to extract the values of the student's grades. If no grades are present, the value "no grades" appears in the HTML instead. However, when I use a select request such as the following:

doc.select("[class=cellright]") 

I get output with the grade values listed twice (because they are nested within two elements that both match the [class=cellright] distinguisher), and the normal number of "no grades" listings. My question is: how can I select only the innermost child in the document that matches the distinguisher [class=cellright]? (I have already dealt with the issue of the blank value.) Any help is appreciated!

There are many possibilities to do this.

One is this: for each "cellright" element, test whether any of its parents carry the class as well. Discard the element if you find one that does:

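    import java.util.ArrayList;
    import java.util.List;
    import org.jsoup.nodes.Element;
    import org.jsoup.select.Elements;

    // doc is the already parsed org.jsoup.nodes.Document
    List<Element> keepList = new ArrayList<>();
    Elements els = doc.select(".cellright");
    for (Element el : els) {
        boolean keep = true;
        for (Element parentEl : el.parents()) {
            if (parentEl.hasClass("cellright")) {
                // an ancestor carries the class as well -> discard
                keep = false;
                break;
            }
        }
        if (keep) {
            keepList.add(el);
        }
    }
    // keepList now holds only the cellright elements that have no cellright ancestor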

Note that this was written without a compiler and off the top of my head. There might be spelling/syntax errors.
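
If you want the strictly innermost matches, the check can be turned around: keep an element only when none of its descendants carry the class. Below is a rough sketch of that idea. It deliberately swaps Jsoup for the JDK's built-in DOM parser and XPath so that it runs without extra dependencies, which means it only works on well-formed markup (the real page is not). The class name InnermostCells, the method innermostTexts and the SAMPLE string are made up for illustration; SAMPLE is a simplified stand-in for the table in the question.

```java
import java.io.StringReader;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.xpath.XPathConstants;
import javax.xml.xpath.XPathFactory;
import org.w3c.dom.Document;
import org.w3c.dom.NodeList;
import org.xml.sax.InputSource;

// Hypothetical helper, not part of Jsoup.
class InnermostCells {
    // XPath 1.0 idiom for "the class attribute contains the token cellright".
    static final String HAS_CLASS =
        "contains(concat(' ', normalize-space(@class), ' '), ' cellright ')";

    // Simplified, well-formed stand-in for the page in the question (assumption).
    static final String SAMPLE =
        "<table>"
        + "<tr><td class=\"cellright\"><table><tr>"
        + "<td class=\"cellright\">92%</td><td class=\"cellright\"></td>"
        + "</tr></table></td></tr>"
        + "<tr><td class=\"cellright\"><table><tr>"
        + "<td class=\"cellcenter\"><span>no grades</span></td>"
        + "</tr></table></td></tr>"
        + "</table>";

    // Returns the trimmed text of every element that carries the class
    // and has no descendant element carrying it.
    static List<String> innermostTexts(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance()
                .newDocumentBuilder()
                .parse(new InputSource(new StringReader(xml)));
            String expr = "//*[" + HAS_CLASS + " and not(.//*[" + HAS_CLASS + "])]";
            NodeList nodes = (NodeList) XPathFactory.newInstance().newXPath()
                .evaluate(expr, doc, XPathConstants.NODESET);
            List<String> texts = new ArrayList<>();
            for (int i = 0; i < nodes.getLength(); i++) {
                texts.add(nodes.item(i).getTextContent().trim());
            }
            return texts;
        } catch (Exception e) {
            throw new IllegalArgumentException("could not parse or query markup", e);
        }
    }

    public static void main(String[] args) {
        System.out.println(innermostTexts(SAMPLE));
    }
}
```

On the sample, the outer cell around "92%" is dropped because it contains other cellright cells, while the outer cell around "no grades" is kept because nothing inside it carries the class. The empty cell next to the grade is also kept, so the blank value still has to be filtered out separately.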

Another note: the use of "[class=cellright]" only works if the element has a single class. With multiple classes in random order (which is totally to be expected), it is better to use the dot syntax ".cellright".
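
To make the difference concrete, here is a small plain-Java model of the two matching rules. It is only an approximation for illustration, not Jsoup's actual implementation, and it ignores details such as case handling. The class name ClassMatch and both method names are invented for this sketch.

```java
import java.util.Arrays;

// Hypothetical helper that mimics the two matching rules in plain Java.
class ClassMatch {
    // Attribute-equals rule, as in [class=cellright]:
    // the whole attribute value has to be equal to the wanted string.
    static boolean attributeEquals(String classAttr, String wanted) {
        return classAttr.equals(wanted);
    }

    // Class-token rule, as in .cellright:
    // any whitespace-separated token of the attribute may match.
    static boolean hasClassToken(String classAttr, String wanted) {
        return Arrays.asList(classAttr.trim().split("\\s+")).contains(wanted);
    }

    public static void main(String[] args) {
        String attr = "highlight cellright";
        System.out.println(attributeEquals(attr, "cellright"));
        System.out.println(hasClassToken(attr, "cellright"));
    }
}
```

With a class attribute of "highlight cellright", the attribute-equals rule fails because the full string differs, while the token rule still finds "cellright" regardless of its position.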

