Topic: Advise RE strange clipping phenomenon

Offline amylase

Advise RE strange clipping phenomenon
« on: February 17, 2009, 10:17:37 AM »
I added replace_string and sscanf to wget, in effect turning it into a kind of text web browser. Nothing fancy: it just downloads a .htm page, strips off all the tags, then prints the result to the screen. The problem I have now is a strange clipping of the text. I don't know how to describe it other than to illustrate it with a specific example.

Here is the code for cat10.c:
Code:
#include <lib.h>
#include <daemons.h>
inherit LIB_DAEMON;
mapping FilesMap = ([]);

varargs int cmd(string str){
    object wget = new(WGET_D, this_object());
    if(!FilesMap) FilesMap = ([]);   
    FilesMap[wget] = this_player();
    wget->eventGet(str);
    return 1;
}

string refine(string kontent){
    // who->eventPrint(kontent);

    string *arr;
    string *cap;
    string *e;
    string s1,s2,s3, rep1,rep2, t1,t2,t3; string st,en;
    string b;
    string c1="<td class=\"TWFTTitle\">";
    string c2="<td  class=\"TWFTBibleReading\">";
    string *r1 = ({"","&nbsp;","&nbsp", "nbsp", "<br>", "<b>", "</b>", "-->", "\t"}); // Replace these strings with
    string *r2 = ({""," ", " "," ","\n", "%^BOLD%^YELLOW%^", "%^RESET%^", "", ""}); // these strings.
    string *start = ({"","<HEAD>","<script ", "<"}); // Remove everything between these and
    string *end = ({"","</HEAD>","</script>", ">"}); // these (inclusive).

    int feedback=3, count=1, checkcap=3;
    string tmp=kontent;

    //Replacing
    count=1;
    while(count < sizeof(r1)){
        rep1=r1[count]; rep2=r2[count];
        tmp=replace_string(tmp,rep1,rep2);
        count++;
    }

    //Splicing
    count=1;
    while(count < sizeof(start)){
        while(feedback>0){
            feedback=sscanf(tmp,"%s"+start[count]+"%s"+end[count]+"%s", s1, s2, s3);
            if(s2) {if(!s1) s1=""; if(!s3) s3=""; tmp=s1+s3;} //!!!!I reckon problem is here or line above!!!!
        }
        feedback=3; count++;
    }

    return(tmp);
}


varargs mixed eventReceiveWebData(string content, string file){
    object wget = previous_object();
    object who;
    string finale;
    if(!FilesMap[wget] || !(who = FilesMap[wget])){
        map_delete(FilesMap, wget);
        return 1;
    }

    finale=refine(content);
    who->eventPrint(finale);
    return 1;
}


int help()
{
    write( @EndText
Syntax: cat10 <url>

EndText
    );
    return 1;
}
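
For anyone who wants to experiment outside the MUD, the two passes in refine() (string replacement, then cutting out everything between a start marker and an end marker) can be sketched roughly as below. This is Python rather than LPC, and `REPLACEMENTS`, `SPANS` and `splice_out` are names made up for the sketch. `str.find` stands in for the sscanf pattern, on the assumption that both stop at the first occurrence of each marker.

```python
# Rough Python stand-in for the two passes in refine(); not the LPC code itself.
REPLACEMENTS = [
    ("&nbsp;", " "), ("&nbsp", " "), ("nbsp", " "),
    ("<br>", "\n"),
    ("<b>", "%^BOLD%^YELLOW%^"), ("</b>", "%^RESET%^"),
    ("-->", ""), ("\t", ""),
]
SPANS = [("<HEAD>", "</HEAD>"), ("<script ", "</script>"), ("<", ">")]


def splice_out(text, start, end):
    """Remove every start...end span (markers included) from text."""
    while True:
        i = text.find(start)
        if i < 0:
            return text
        j = text.find(end, i + len(start))
        if j < 0:
            # Assumption: an unterminated span is left untouched.
            return text
        text = text[:i] + text[j + len(end):]


def refine(content):
    for old, new in REPLACEMENTS:
        content = content.replace(old, new)
    for start, end in SPANS:
        content = splice_out(content, start, end)
    return content


print(refine("<b>Hi</b>&nbsp;<i>there</i>"))
# %^BOLD%^YELLOW%^Hi%^RESET%^ there
```

The sketch only splices when both markers are found. Whether the LPC loop behaves the same way on a partial sscanf match is not something this thread establishes.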

Here is what I typed in the game to fetch a web page with cat10:
Quote
cat10 http://www.ucb.com.au/index.htm

Here is the actual HTML content of index.htm, as retrieved on February 18th:
Code:






 

 

 


<html>
<head>
  <link rel="stylesheet" type="text/css" href="styles.css">
  <script type="text/javascript" src="left.js"></script>
  <meta http-equiv="imagetoolbar" content="no">

  <title>UCB - Connecting Faith to Life</title>
</head>
<body>
<script type='text/javascript' src='exmplmenu_var.js'></script>
<script type='text/javascript' src='menu_com.js'></script>
<noscript>Your browser does not support script</noscript>

<TABLE WIDTH="100%"  BORDER="0" CELLSPACING="0" CELLPADDING="0">
<TR><TD>

<TABLE BORDER="0" CELLPADDING="0" CELLSPACING="0" BACKGROUND="images/headbck.gif" WIDTH="100%">

<TR>
<TD><img alt=''  SRC="images/head1_1.gif" BORDER="0"><img alt=''  SRC="images/head1_2.gif" BORDER="0"><img alt=''  SRC="images/head1_3.gif" BORDER="0"><img alt=''  SRC="images/head1_4.gif" BORDER="0"><img alt=''  SRC="images/head1_5.gif" BORDER="0"><img alt=''  SRC="images/head1_6.gif" BORDER="0"><img alt=''  SRC="images/head1_7.gif" BORDER="0"><img alt=''  SRC="images/head1_8.gif" BORDER="0"><img alt=''  SRC="images/head1_9.gif" BORDER="0"</TD>
</TR>
</TABLE>

<TABLE BORDER="0" CELLPADDING="0" CELLSPACING="0" WIDTH="100%" BGCOLOR="#FFFFFF">
<TR>
<TD><img alt=''  SRC="images/menublock.gif" WIDTH="780" HEIGHT="17" BORDER="0"></TD>
</TR>
</TABLE>


<TABLE WIDTH="100%" BORDER="0" CELLPADDING="0" CELLSPACING="0">

<TR>
<TD VALIGN="TOP" width="160px">

<TABLE BORDER=0 CELLPADDING=0 CELLSPACING=0 ALIGN="LEFT">
<TR>
<TD><br /><A HREF="/" ONMOUSEOVER="on_img(p6,12);return true" ONMOUSEOUT ="on_img(p6,11);return true"><img alt=''  SRC="images/ic_home.gif" name="p6" WIDTH="100" HEIGHT="24" BORDER="0" VSPACE="0" HSPACE="0"></A></TD>
</TR>
<TR>
<TD><A HREF="https://secure.ucb.com.au" ONMOUSEOVER="on_img(p1,2);return true" ONMOUSEOUT="on_img(p1,1);return true"><img alt='' SRC="images/ic_donations.gif" name="p1" WIDTH="100" HEIGHT="24" BORDER="0" VSPACE="0" HSPACE="0"></A></TD>

</TR>
<TR>
<TD><A HREF="vacancies.htm" ONMOUSEOVER="on_img(p7,14);return true" ONMOUSEOUT="on_img(p7,13);return true"><img alt='' SRC="images/ic_vacancies.gif" name="p7" WIDTH="100" HEIGHT="24" BORDER="0" VSPACE="0" HSPACE="0"></A></TD>
</TR>
<TR>
<TD><A HREF="contact.htm" ONMOUSEOVER="on_img(p4,8);return true" ONMOUSEOUT ="on_img(p4,7);return true"><img alt=''  SRC="images/ic_contact.gif" name="p4" WIDTH="100" HEIGHT="24" BORDER="0" VSPACE="0" HSPACE="0"></A></TD>
</TR>
<TR>
<TD><A HREF="links.htm" ONMOUSEOVER="on_img(p5,10);return true" ONMOUSEOUT ="on_img(p5,9);return true"><img alt=''  SRC="images/ic_links.gif" name="p5" WIDTH="100" HEIGHT="24" BORDER="0" VSPACE="0" HSPACE="0"></A></TD>

</TR>
<TR>
<TD><A ONMOUSEOVER="on_img(p2,4);return true" ONMOUSEOUT ="on_img(p2,3);return true" TARGET='popup' HREF="#" ONCLICK="window.open('chat.htm#1', 'popup',
'toolbar=no,scrolling=auto,status=no,menubar=no,scrollbars=no,resizable=yes,width=650,height=550'); return false" ><img alt=''  SRC="images/ic_chat.gif" name="p2" WIDTH="100" HEIGHT="24" BORDER="0" VSPACE="0" HSPACE="0"></A></TD>
</TR>
<TR>
<TD>
<br /><br />

<A HREF="http://www.vision.org.au"><img alt='' TITLE="Visit VISION Online" SRC="images/logo_vision.jpg" WIDTH="105" HEIGHT="26" BORDER="0"></A><br /><br />
<A HREF="http://www.thewordfortoday.com.au"><img alt='' TITLE="The Word For Today" SRC="images/twftlogo.gif" WIDTH="100" HEIGHT="50" BORDER="0"></A><br /><br /><br />

<A HREF="http://www.ucbdirect.com.au"><img alt='' TITLE="Purchase Products Online" SRC="http://www.thewordfortoday.com.au/images/ucbdirect.gif" WIDTH="105" HEIGHT="52" BORDER="0"></A><br /><br />
<A HREF="http://www.prayforme.com.au"><img alt='' TITLE="Need Prayer" SRC="images/prayforme-logo.gif" WIDTH="103" HEIGHT="30" BORDER="0"></A><br /><br />
<A HREF="http://www.word4u2day.com.au"><img alt='' TITLE="Word 4 U 2day" SRC="images/W4U2Day-logo.gif" WIDTH="103" BORDER="0"></A><br /><br /><br />
<!-- <A HREF="needhelp.htm"><img alt=''  SRC="images/needhelp.gif" WIDTH="100" HEIGHT="51" BORDER="0"></A><br /> -->
</TD>
</TR>
</TABLE>

</TD>
<TD VALIGN="TOP">



<center>
<span class="cmteaser">
<span class="cmteaser">
<a href="http://www.vision.org.au/vis08b_main.htm" target="_blank"><img src="http://www.vision.org.au/images/Visionathon-banner_totala.gif" border="0" /></a><a href="https://secure.ucb.com.au" target="_blank"><img src="http://www.vision.org.au/images/Visionathon-banner_totalb.gif" border="0" /></a><br />

</center>
<br><center><a href="http://www.vision.org.au/vis08b_main.htm">Latest Updates</a> | <a href="http://www.vision.org.au/vis08b_why.htm">Why?</a> | <a href="http://www.vision.org.au/vis08b_gallery.htm">Gallery</a> | <a href="http://www.vision.org.au/vis08b_stories.htm">Testimonies</a> | <a target="_blank" href="https://secure.ucb.com.au/"><B>DONATE</B></a></br></center>

<div align="left"><br />
  <br />

</div>

<TABLE CELLPADDING="10" CELLSPACING="0" BORDER="0">
<TR>
<TD WIDTH="80%" VALIGN="TOP">


 






<br />
<table width="100%" border="0" cellspacing="0" cellpadding="2" align="left">
<tr>
<td width="100%" class="TWFTDate" align="left">Wednesday 18 February 2009</td>

</tr>

<tr>
  <td class="TWFTTitle">The privileges and responsibilities of membership</td>
</tr>
<tr align="center">
  <td>
    "<span class="TWFTScripture">You are...fellow citizens...members of God's household... a holy temple.</span>"<br />

    <span class="TWFTScriptureReference">Ephesians 2:19-21 NIV</span>   </td>
</tr>
<tr valign="top">
  <td class="TWFTContent">
  <p align="justify"> If your children stood outside your house pleading to get in, what would you think? Wouldn't you say, "Come in, you're my flesh and blood, I love you, you don't need to beg?" Well, we can come into God's presence at any time. We are "No longer foreigners...but fellow citizens...of the household of God...a holy temple" (Eph 2:19-21 NIV). What privileges:
<br/> (a) As "fellow citizens" we represent God's Kingdom on the earth. We are His ambassadors (See 2 Cor 5:20). "What does an ambassador do?" you ask. He stays in communication with his king, understands his will and makes sure it's carried out. He also knows he doesn't belong there permanently, so he lives ready for recall at a moment's notice. Getting the idea? (b) Because we belong to the "household of God" we can come confidently before God at any time, with any need, and know that we'll be received with love. God is the father you always hoped for and you are the child He always wanted. If you have any doubts, look at the cross; that's how much God values you. But remember, every family member is supposed to contribute, be loyal, and make sure the family's good name is protected. (c) We are "a holy temple". In the Old Testament God had a temple for His people, but now God has a people for His temple. The Bible says, "Do you not know that your body is a temple of the Holy Spirit, who is in you, whom you have received from God? You are not your own; you were bought at a price. Therefore honour God" (1 Cor 6:19-20 NIV).</p> </td>

  </tr>
  <tr>
<td  class="TWFTBibleReading">
  <b>Soulfood Bible Readings</b>&nbsp;&nbsp;<img src="images/bible.gif" width="17px" height="12px" border="0" /><br />
  Rom 12-14, Matt 15:15-28, Ps 28, Pr 5:3-6 </td>
  </tr>

<tr>
<td align="center">

  <p><span class="TWFTContent">
    <br />
    <br />
    To view any more information please visit our <b>NEW</b> Word For Today Site at<br />
    <a href="http://www.thewordfortoday.com.au?rf=ucb" target="_blank">http://www.thewordfortoday.com.au</a>
          <br />

    &copy; <a href="http://www.thewordfortoday.com.au/copyright.htm">Copyright</a> 1997-2008. All Rights Reserved. </span></p>
  <p>
        <a href="Gerald_Rowlands_prophecy_summary.pdf"><img src="images/geraldrowlands_prophecy1.gif" width="600" height="60" border="0" /></a></p></td>
</tr>
</table>


  </TD>


<td width="160" valign="top" align="center">
<br /><br /><br />
<P>
<CENTER><img alt=''  SRC="images/twftlogo.gif" WIDTH="100" HEIGHT="50" BORDER="0"></CENTER>
<br /><br />
Written by Bob & Debby Gass<br /><br />
<a href="author.htm">More about the author</a>
</td>

</tr>
</table>

</TD>
</TR>
</TABLE>
</TD></TR>
<TR><TD VALIGN="BOTTOM">
<TABLE BORDER="0" CELLPADDING="0" CELLSPACING="0" WIDTH="100%" BACKGROUND="images/bot_bck.gif">
<TR>
<TD VALIGN="BOTTOM" ALIGN="LEFT"><P><FONT FACE="Arial" SIZE="1" COLOR="#306898">&nbsp;<B>Phone</B> +61 (7) 3387 7300  <B>Facsimile </B>+61 (7) 3387 7333

<B>FREECALL</B>&nbsp;1800&nbsp;007&nbsp;770.  <a href="privacy.htm">Privacy Policy</a><br /></FONT><A HREF="http://www.ucb.com.au"><img alt=''  SRC="images/space.gif" WIDTH="103" HEIGHT="19" BORDER="0" HSPACE="0" VSPACE="0"></A></TD>
<TD VALIGN="BOTTOM"><img alt=''  SRC="images/tag.gif" WIDTH="246" HEIGHT="19" BORDER="0" HSPACE="0"></TD>
<TD ALIGN="RIGHT"><img alt=''  SRC="images/bot_corner.gif" USEMAP="#home" BORDER=0><MAP NAME="home"><AREA alt='' SHAPE=RECT COORDS="102, 21, 131, 50" HREF="/"></MAP></TD>
</TR>
</TABLE>


</TD></TR></TABLE>
<script src="http://www.google-analytics.com/urchin.js" type="text/javascript">
</script>

<script type="text/javascript">
_uacct = "UA-1035608-1";
urchinTracker();
</script>
</body>
</html>


Unfortunately, this is part of what I get on screen (notice the clipping around "(a)"):
Quote
Wednesday 18 February 2009



  The privileges and responsibilities of membership


 
    "You are...fellow citizens...members of God's household... a holy temple."
    Ephesians 2:19-21 NIV 


 
   If your children stood outside your house pleading to get in, what would you
think? Wouldn't you say, "Come in, you're my flesh and blood, I love you, you
don't need to beg?" Well, we can come into God's presence at any time. We are
"No longer foreigners...but fellow citizens...of the household of God...a holy
(a) As "fellow citizens" weWhat privileges:

represent God's Kingdom on the earth. We are His ambassadors (See 2 Cor 5:20).
"What does an ambassador do?" you ask. He stays in communication with his king,
understands his will and makes sure it's carried out. He also knows he doesn't
belong there permanently, so he lives ready for recall at a moment's notice.
Getting the idea? (b) Because we belong to the "household of God" we can come
confidently before God at any time, with any need, and know that we'll be
received with love. God is the father you always hoped for and you are the
child He always wanted. If you have any doubts, look at the cross; that's how
much God values you. But remember, every family member is supposed to
contribute, be loyal, and make sure the family's good name is protected. (c) We
are "a holy temple". In the Old Testament God had a temple for His people, but
now God has a people for His temple. The Bible says, "Do you not know that your
body is a temple of the Holy Spirit, who is in you, whom you have received from
God? You are not your own; you were bought at a price. Therefore honour God" (1
Cor 6:19-20 NIV).
 
 

  Soulfood Bible Readings 
  Rom 12-14, Matt 15:15-28, Ps 28, Pr 5:3-6

I traced my code and narrowed the problem down to the sscanf call, but I am unsure how to fix it. I also tried filtering out tabs (\t), and that didn't fix the problem either. Any advice would be greatly appreciated.

Offline cratylus

Re: Advise RE strange clipping phenomenon
« Reply #1 on: April 15, 2009, 09:58:36 AM »
I tried your code. That particular religious page seems
to be different now, but the problem is still visible:
some text you can see in your regular browser appears
to be omitted from the display of your command.

I noticed that your command does not save that incoming
data. Doing this is important for troubleshooting. We need
to know whether the problem lies in the wget daemon,
or in your code. I modified your command so that it saves
the data, for comparison with the output.

The output I got for today contains the following:

Quote
(1) Save for ais concerned, the Bible teaches us that we are to:

In my browser, that looks like:

Quote
Where money is concerned, the Bible teaches us that we are to:
(1) Save for a rainy day.

On close inspection this is not likely to be an omission
of text, but rather some kind of escape code problem.
At first blush I'd look into stripping something out, since
what it looks like is that the second line overwrites
the first line in your display, without a line break to
separate them.
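
The overwrite effect can be reproduced with a toy terminal model. The sketch below is Python, not LPC, and `render` is a made-up helper: it treats "\r" as "move the cursor back to column 0" and "\n" as "start a new row". The two leading spaces and the point where the second line is cut off are assumptions chosen to reproduce the garbled line quoted above.

```python
def render(text):
    """Toy terminal: '\r' returns to column 0, '\n' starts a new row."""
    rows = [[]]
    col = 0
    for ch in text:
        if ch == "\r":
            col = 0                 # cursor back to the start of the row
        elif ch == "\n":
            rows.append([])         # new row, cursor at column 0
            col = 0
        else:
            row = rows[-1]
            if col < len(row):
                row[col] = ch       # overwrite what is already there
            else:
                row.append(ch)
            col += 1
    return ["".join(row) for row in rows]


# Assumed input: first sentence, a bare carriage return, start of the next one.
line = "  Where money is concerned, the Bible teaches us that we are to:\r(1) Save for a"
print(render(line)[0])
# (1) Save for ais concerned, the Bible teaches us that we are to:
```

No text is lost in the data itself; the later characters are simply drawn on top of the earlier ones.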

To check whether the problem is your command or wget, we can
inspect what it looks like in the saved file:

Code:
<p align="justify">Where money is concerned, the Bible teaches us that we are to:^M<br/>      (1) Save for a rainy day.
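
The ^M in that listing is caret notation for a carriage return (character code 13). As a rough illustration of how such a dump makes control characters visible, here is a small Python sketch; `caret_notation` is a made-up helper, not part of the mudlib or of any tool used in this thread.

```python
def caret_notation(text):
    """Show control characters as ^X so they are visible in a dump."""
    out = []
    for ch in text:
        code = ord(ch)
        if code < 32 and ch != "\n":
            out.append("^" + chr(code + 64))   # 13 -> ^M, 9 -> ^I
        elif code == 127:
            out.append("^?")
        else:
            out.append(ch)
    return "".join(out)


print(caret_notation("we are to:\r<br/>      (1) Save for a rainy day."))
# we are to:^M<br/>      (1) Save for a rainy day.
```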

O ho! We see that the text is not omitted, but perhaps more
interestingly, there is a ^M sitting right there. Suppose that
character does not get recognized properly as a line break?
Let's try changing your code here:

Code:
string *r1 = ({"","&nbsp;","&nbsp", "nbsp", "<br>", "<b>", "</b>", "-->", "\t"}); // Replace these strings with
string *r2 = ({""," ", " "," ","\n", "%^BOLD%^YELLOW%^", "%^RESET%^", "", ""}); // these strings.
   


with:

Code:
    string *r1 = ({"","&nbsp;","&nbsp", "nbsp", "<br>", "<b>", "</b>", "-->", "\t","\r"}); // Replace these strings with
    string *r2 = ({""," ", " "," ","\n", "%^BOLD%^YELLOW%^", "%^RESET%^", "", "", "\n"}); // these strings.

This turns each "carriage return" ("\r") into a "newline" ("\n").

The results...
Quote
  Where money is concerned, the Bible teaches us that we are to:
(1) Save for a rainy day. "There is...treasure...in the dwelling of the wise.

Yippee!
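
A possible refinement, sketched in Python with made-up helper names: if a page sends "\r\n" pairs rather than a bare "\r" (an assumption, since the saved data above shows only the ^M), replacing "\r" alone would produce two newlines per line break. Collapsing the pair first avoids that.

```python
def cr_to_newline(text):
    # The replacement described in the post: every "\r" becomes "\n".
    return text.replace("\r", "\n")


def normalize_newlines(text):
    # Assumption: input may mix "\r\n", bare "\r" and bare "\n".
    # Collapse the pair first so it does not turn into two newlines.
    return text.replace("\r\n", "\n").replace("\r", "\n")


print(repr(cr_to_newline("to:\r\n(1) Save")))       # 'to:\n\n(1) Save'
print(repr(normalize_newlines("to:\r\n(1) Save")))  # 'to:\n(1) Save'
```

For data that contains only bare carriage returns, both functions give the same result.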

-Crat

Offline amylase

Re: Advise RE strange clipping phenomenon
« Reply #2 on: April 15, 2009, 10:46:33 AM »
You are a champion. Thanks very much.