<?xml version="1.0" encoding="UTF-8"?> <rss
version="2.0"
xmlns:content="http://purl.org/rss/1.0/modules/content/"
xmlns:wfw="http://wellformedweb.org/CommentAPI/"
xmlns:dc="http://purl.org/dc/elements/1.1/"
xmlns:atom="http://www.w3.org/2005/Atom"
xmlns:sy="http://purl.org/rss/1.0/modules/syndication/"
xmlns:slash="http://purl.org/rss/1.0/modules/slash/"
><channel><title>DSCRIPTS</title> <atom:link href="http://www.dscripts.net/feed/" rel="self" type="application/rss+xml" /><link>http://www.dscripts.net</link> <description>Meet your development needs</description> <lastBuildDate>Tue, 13 Dec 2011 06:34:00 +0000</lastBuildDate> <language>en</language> <sy:updatePeriod>hourly</sy:updatePeriod> <sy:updateFrequency>1</sy:updateFrequency> <generator>http://wordpress.org/?v=3.3</generator><meta
name="generator" content="deStyle 0.9.3" /> <item><title>How to force download a file using ASP.NET</title><link>http://www.dscripts.net/2011/12/08/how-to-force-download-a-file-using-asp-net/</link> <comments>http://www.dscripts.net/2011/12/08/how-to-force-download-a-file-using-asp-net/#comments</comments> <pubDate>Thu, 08 Dec 2011 05:26:15 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[ASP.NET]]></category> <category><![CDATA[C#]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=422</guid> <description><![CDATA[Sometimes a file download has to be served through an aspx page instead of being linked to directly. The code below forces the browser to download a file through an aspx page. Assume the file path is held in a &#8220;string path&#8221; variable, and add the code to the code-behind (.cs) file of your aspx page.]]></description> <content:encoded><![CDATA[<p>Sometimes a file download has to be served through an aspx page instead of being linked to directly. The code below forces the browser to download a file through an aspx page. Assume the file path is held in a &#8220;string path&#8221; variable, and add the code to the code-behind (.cs) file of your aspx page.</p><pre class="brush: csharp; title: ; notranslate">

// Runs inside the page's code-behind (for example in Page_Load); needs &quot;using System.IO;&quot;.
string path = &quot;the path to your file&quot;;
string fileName = &quot;Add your file name&quot;;
string contentType = &quot;Add your file content type. ex: for text file it is 'text/plain'&quot;;
HttpResponse response = Response; // the current page's response object

response.Clear();
response.ClearHeaders();
response.AddHeader(&quot;Content-Disposition&quot;, &quot;attachment; filename=&quot; + fileName);
// Take the size from the file itself instead of typing it in by hand.
response.AddHeader(&quot;Content-Length&quot;, new FileInfo(path).Length.ToString());
response.ContentType = contentType;

// Copy the file to the response in 8 KB chunks; &quot;using&quot; closes the file afterwards.
using (FileStream f = new FileStream(path, FileMode.Open, FileAccess.Read))
{
    byte[] buffer = new byte[8 * 1024];
    int len;
    while ((len = f.Read(buffer, 0, buffer.Length)) &gt; 0)
    {
        response.OutputStream.Write(buffer, 0, len);
    }
}
response.Flush();
response.End();
</pre>]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2011/12/08/how-to-force-download-a-file-using-asp-net/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Pipilika &#8211; The Search Engine of Bangladesh</title><link>http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/</link> <comments>http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/#comments</comments> <pubDate>Mon, 11 Jul 2011 15:59:27 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[Blogs]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=364</guid> <description><![CDATA[We are introducing Pipilika, the first ever multilingual web search engine in Bangladesh, available in both Bengali and English. This free web service provides search results on nationwide news and business information to the people and for the people. A focused crawler collects news from the popular daily newspapers and indexes it in our indexer. [...]]]></description> <content:encoded><![CDATA[<p><a
href="http://www.dscripts.net/wp-content/uploads/2011/07/pipilika.png" rel="wp-prettyPhoto[g364]"><img
class="aligncenter size-medium wp-image-365" title="Pipilika" src="http://www.dscripts.net/wp-content/uploads/2011/07/pipilika-300x194.png" alt="" width="300" height="194" /></a></p><p>We are introducing Pipilika, the first ever multilingual web search engine in Bangladesh, available in both Bengali and English. This free web service provides search results on nationwide news and business information to the people and for the people. A focused crawler collects news from the popular daily newspapers and indexes it in our indexer, so the search engine can supply up-to-date news, articles, reviews, etc. from different news portals. Renowned search engines have not paid much attention to Bengali search, so we have tried to put more emphasis on Bengali news analysis and searching.</p><p><span
id="more-364"></span></p><p>More Screenshots:</p><a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/pipilika/' title='Pipilika'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/pipilika-150x150.png" class="attachment-thumbnail" alt="Pipilika" title="Pipilika" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/pipilika-logo/' title='Pipilika Logo'><img
width="150" height="136" src="http://www.dscripts.net/wp-content/uploads/2011/07/pipilika-logo-150x136.png" class="attachment-thumbnail" alt="pipilika" title="Pipilika Logo" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/pip-logo/' title='Pipilika Logo'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/pip.logo_-150x150.png" class="attachment-thumbnail" alt="Pipilika Logo" title="Pipilika Logo" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/pip-logo-2/' title='Pipilika Logo'><img
width="130" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/pip.logo_1-130x150.png" class="attachment-thumbnail" alt="Pipilika Logo" title="Pipilika Logo" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image047/' title='image047'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image047-150x150.jpg" class="attachment-thumbnail" alt="image047" title="image047" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image048/' title='image048'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image048-150x150.jpg" class="attachment-thumbnail" alt="image048" title="image048" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image049/' title='image049'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image049-150x150.png" class="attachment-thumbnail" alt="image049" title="image049" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image050/' title='image050'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image050-150x150.jpg" class="attachment-thumbnail" alt="image050" title="image050" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image051/' title='image051'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image051-150x150.png" class="attachment-thumbnail" alt="image051" title="image051" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image052/' title='image052'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image052-150x150.jpg" class="attachment-thumbnail" alt="image052" title="image052" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image053/' title='image053'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image053-150x150.png" class="attachment-thumbnail" alt="image053" title="image053" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image054/' title='image054'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image054-150x150.jpg" class="attachment-thumbnail" alt="image054" title="image054" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image055/' title='image055'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image055-150x150.png" class="attachment-thumbnail" alt="image055" title="image055" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image056/' title='image056'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image056-150x150.jpg" class="attachment-thumbnail" alt="image056" title="image056" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image057/' title='image057'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image057-150x150.png" class="attachment-thumbnail" alt="image057" title="image057" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image058/' title='image058'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image058-150x150.jpg" class="attachment-thumbnail" alt="image058" title="image058" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image059/' title='image059'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image059-150x150.png" class="attachment-thumbnail" alt="image059" title="image059" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image060/' title='image060'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image060-150x150.jpg" class="attachment-thumbnail" alt="image060" title="image060" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image061/' title='image061'><img
width="150" height="93" src="http://www.dscripts.net/wp-content/uploads/2011/07/image061-150x93.png" class="attachment-thumbnail" alt="image061" title="image061" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image062/' title='image062'><img
width="150" height="121" src="http://www.dscripts.net/wp-content/uploads/2011/07/image062-150x121.jpg" class="attachment-thumbnail" alt="image062" title="image062" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image063/' title='image063'><img
width="150" height="124" src="http://www.dscripts.net/wp-content/uploads/2011/07/image063-150x124.png" class="attachment-thumbnail" alt="image063" title="image063" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image064/' title='image064'><img
width="150" height="112" src="http://www.dscripts.net/wp-content/uploads/2011/07/image064-150x112.jpg" class="attachment-thumbnail" alt="image064" title="image064" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image065/' title='image065'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image065-150x150.png" class="attachment-thumbnail" alt="image065" title="image065" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image066/' title='image066'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image066-150x150.jpg" class="attachment-thumbnail" alt="image066" title="image066" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image067/' title='image067'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image067-150x150.png" class="attachment-thumbnail" alt="image067" title="image067" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image068/' title='image068'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image068-150x150.jpg" class="attachment-thumbnail" alt="image068" title="image068" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image069/' title='image069'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image069-150x150.png" class="attachment-thumbnail" alt="image069" title="image069" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image070/' title='image070'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image070-150x150.jpg" class="attachment-thumbnail" alt="image070" title="image070" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image071/' title='image071'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image071-150x150.png" class="attachment-thumbnail" alt="image071" title="image071" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image072/' title='image072'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image072-150x150.jpg" class="attachment-thumbnail" alt="image072" title="image072" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image073/' title='image073'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image073-150x150.png" class="attachment-thumbnail" alt="image073" title="image073" /></a> <a
href='http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/image074/' title='image074'><img
width="150" height="150" src="http://www.dscripts.net/wp-content/uploads/2011/07/image074-150x150.jpg" class="attachment-thumbnail" alt="image074" title="image074" /></a>]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2011/07/11/pipilika-the-search-engine-of-bangladesh/feed/</wfw:commentRss> <slash:comments>5</slash:comments> </item> <item><title>Group by date from a datetime or timestamp field in Mysql</title><link>http://www.dscripts.net/2010/12/14/group-by-date-from-a-datetime-or-timestamp-field-in-mysql/</link> <comments>http://www.dscripts.net/2010/12/14/group-by-date-from-a-datetime-or-timestamp-field-in-mysql/#comments</comments> <pubDate>Tue, 14 Dec 2010 17:01:34 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[SQL]]></category> <category><![CDATA[date_format]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=82</guid> <description><![CDATA[This is a simple but very useful MySQL query. It lists the distinct dates stored in a datetime or timestamp field, ignoring the time part. [...]]]></description> <content:encoded><![CDATA[<p>This is a simple but very useful MySQL query. It lists the distinct dates stored in a datetime or timestamp field, ignoring the time part.</p><p><span
id="more-82"></span></p><p>The table</p><table
id="table_results" class="data"><thead><tr><th> REGISTRATION_NO</th><th> EXAM_ROLL</th><th> DATE</th></tr></thead><tbody><tr
class="odd"><td
class="nowrap" align="right">2010231001</td><td
class="nowrap" align="right">20016</td><td
class="nowrap">2010-12-03 01:59:38</td></tr><tr
class="even"><td
class="nowrap" align="right">2010338007</td><td
class="nowrap" align="right">30017</td><td
class="nowrap">2010-12-01 11:05:48</td></tr><tr
class="odd"><td
class="nowrap" align="right">2010333007</td><td
class="nowrap" align="right">30022</td><td
class="nowrap">2010-12-01 14:20:19</td></tr><tr
class="even"><td
class="nowrap" align="right">2010132002</td><td
class="nowrap" align="right">30173</td><td
class="nowrap">2010-12-01 10:21:02</td></tr><tr
class="odd"><td
class="nowrap" align="right">2010336009</td><td
class="nowrap" align="right">30192</td><td
class="nowrap">2010-12-01 16:00:35</td></tr><tr
class="even"><td
class="nowrap" align="right">2010132001</td><td
class="nowrap" align="right">30233</td><td
class="nowrap">2010-12-01 10:17:14</td></tr><tr
class="odd"><td
class="nowrap" align="right">2010332042</td><td
class="nowrap" align="right">30239</td><td
class="nowrap">2010-12-01 15:23:39</td></tr><tr
class="even"><td
class="nowrap" align="right">2010331022</td><td
class="nowrap" align="right">30338</td><td
class="nowrap">2010-12-01 11:25:06</td></tr><tr
class="odd"><td
class="nowrap" align="right">2010331024</td><td
class="nowrap" align="right">30352</td><td
class="nowrap">2010-12-01 11:26:33</td></tr><tr
class="even"><td
class="nowrap" align="right">2010334029</td><td
class="nowrap" align="right">30463</td><td
class="nowrap">2010-12-01 13:46:47</td></tr></tbody></table><p>Now we can get the available dates, grouped by day with the time part excluded, using the following query:</p><pre class="brush: sql; title: ; notranslate">
SELECT date_format(`DATE`, &#39;%Y-%m-%d&#39;) AS date
FROM `student_admission_info`
WHERE `DATE` IS NOT NULL
GROUP BY date_format(`DATE`, &#39;%Y-%m-%d&#39;)
ORDER BY date_format(`DATE`, &#39;%Y-%m-%d&#39;) DESC</pre><p>The manual for the MySQL date_format function is here:</p><p><a
href="http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_date-format">http://dev.mysql.com/doc/refman/5.1/en/date-and-time-functions.html#function_date-format</a></p> ]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2010/12/14/group-by-date-from-a-datetime-or-timestamp-field-in-mysql/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Create custom notification on your Ubuntu desktop using python</title><link>http://www.dscripts.net/2010/09/07/create-custom-notification-on-your-ubuntu-desktop-using-python/</link> <comments>http://www.dscripts.net/2010/09/07/create-custom-notification-on-your-ubuntu-desktop-using-python/#comments</comments> <pubDate>Tue, 07 Sep 2010 07:24:25 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[Linux]]></category> <category><![CDATA[Python]]></category> <category><![CDATA[NotifyOSD]]></category> <category><![CDATA[Ubuntu]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=108</guid> <description><![CDATA[You can easily create custom notifications for your applications on Ubuntu. Ubuntu&#8217;s current notification system is called NotifyOSD, and it provides APIs for several languages. With the python-notify library it is really easy, easier than with any of the others. Ubuntu has python-notify installed by default. [...]]]></description> <content:encoded><![CDATA[<p>You can easily create custom notifications for your applications on Ubuntu. Ubuntu&#8217;s current notification system is called NotifyOSD, and it provides APIs for several languages. With the python-notify library it is really easy, easier than with any of the others.</p><p>Ubuntu has python-notify installed by default. If it is missing, install it using</p><pre class="brush: bash; title: ; notranslate">
sudo apt-get install python-notify
</pre><p><span
id="more-108"></span></p><h3>Sample notification program using python for notifyOSD</h3><p>Now open gedit or any other text editor, paste the following code, and save it as notify.py</p><pre class="brush: python; title: ; notranslate">
#!/usr/bin/python
import sys
import pynotify

if __name__ == &quot;__main__&quot;:
	if not pynotify.init(&quot;icon-summary-body&quot;):
		sys.exit(1)

	n = pynotify.Notification(
	    &quot;Burhan Uddin&quot;,
	    &quot;What is life? Full of care? We have no time to stand or stare!&quot;,
	    &quot;notification-message-im&quot;)
	n.show()
</pre><p>Now run the script from a terminal:</p><pre class="brush: bash; title: ; notranslate">
python notify.py
</pre><p>And see the output <img
src='http://www.dscripts.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p><p><a
href="http://www.dscripts.net/wp-content/uploads/2011/04/notifyosd.jpg" rel="wp-prettyPhoto[g108]"><img
class="alignnone size-full wp-image-109" title="NotifyOSD Output" src="http://www.dscripts.net/wp-content/uploads/2011/04/notifyosd.jpg" alt="" width="511" height="231" /></a></p> ]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2010/09/07/create-custom-notification-on-your-ubuntu-desktop-using-python/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Convert unicode codepoints to unicode hex values in java</title><link>http://www.dscripts.net/2010/09/05/convert-unicode-codepoints-to-unicode-hex-values-in-java/</link> <comments>http://www.dscripts.net/2010/09/05/convert-unicode-codepoints-to-unicode-hex-values-in-java/#comments</comments> <pubDate>Sun, 05 Sep 2010 07:39:45 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[Java]]></category> <category><![CDATA[CodePoint]]></category> <category><![CDATA[Unicode]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=113</guid> <description><![CDATA[While developing our crawler, we encountered a Bangla news site (http://www.kalerkantho.com/) that writes its text as decimal code points instead of Unicode hex values. The browser renders the Bangla text correctly, but the page source shows only code points, so the crawler downloaded strings like &#38;#2453;&#38;#2709;&#38;#1609;. For the indexing [...]]]></description> <content:encoded><![CDATA[<p>While developing our crawler, we encountered a Bangla news site (http://www.kalerkantho.com/) that writes its text as decimal code points instead of Unicode hex values. The browser renders the Bangla text correctly, but the page source shows only code points, so the crawler downloaded strings like &amp;#2453;&amp;#2709;&amp;#1609;. For indexing we needed to convert them to hex values so that the Bangla text renders anywhere. The conversion is really simple.</p><p><span
id="more-113"></span></p><p>First split the &quot;&amp;#2453;&amp;#2709;&amp;#1609;&quot; string on &quot;;&quot; and strip the leading &quot;&amp;#&quot; from each token. The numbers that remain are the decimal values of the code points. Parse each one as an integer, add it to &#39;\u0000&#39;, and cast the result to char to get the Unicode character.</p><pre class="brush: java; title: ; notranslate">
int codePoint = 2537; // decimal value taken from one token
char c = (char) (&#39;\u0000&#39; + codePoint); // the character with that code point
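</pre><p>The tokenizing steps described above can be sketched as follows. This is a rough Python version rather than the crawler&#39;s actual Java code; the function name and the sample input are made up for illustration, and it assumes the input holds nothing but decimal references.</p><pre class="brush: python; title: ; notranslate">
```python
def decode_decimal_references(text):
    """Turn a run of decimal character references into the characters they name."""
    chars = []
    for token in text.split(";"):      # tokenize on ";"
        if token.startswith("&#"):     # skips the empty piece after the last ";"
            chars.append(chr(int(token[2:])))  # strip "&#", parse, convert
    return "".join(chars)


# Hypothetical sample, built from two Bengali code points.
sample = "".join("&#%d;" % cp for cp in (2453, 2494))
decoded = decode_decimal_references(sample)
print(decoded == "\u0995\u09be")
```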
</pre>]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2010/09/05/convert-unicode-codepoints-to-unicode-hex-values-in-java/feed/</wfw:commentRss> <slash:comments>1</slash:comments> </item> <item><title>Extract hyperlinks from html using regular expression in java</title><link>http://www.dscripts.net/2010/09/04/extract-hyperlinks-from-html-using-regular-expression-in-java/</link> <comments>http://www.dscripts.net/2010/09/04/extract-hyperlinks-from-html-using-regular-expression-in-java/#comments</comments> <pubDate>Sat, 04 Sep 2010 07:54:09 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[Java]]></category> <category><![CDATA[Crawler]]></category> <category><![CDATA[Regular Expression]]></category> <category><![CDATA[URL Parsing]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=117</guid> <description><![CDATA[Two years ago, I worked on a crawler that fetches web pages from the internet, parses the links on each page, and then visits all the pages it links to. At that time I didn&#39;t know anything about regular expressions, so I had to write around 500 lines of code to parse [...]]]></description> <content:encoded><![CDATA[<p>Two years ago, I worked on a crawler that fetches web pages from the internet, parses the links on each page, and then visits all the pages it links to. At that time I didn&#39;t know anything about regular expressions, so I had to write around 500 lines of code to parse links and meta tags from HTML.</p><p>Yesterday, I had to do the same job again. This time I used a regular expression to parse the HTML &lt;a tags and their href attributes to extract the links.</p><p>Regular expressions can be difficult to understand when written all at once, so I am going to start with a simple form and then make it more complex to support variations in page links.</p><p><span
id="more-117"></span><br
/> <br/></p><h3>1. Parsing page title using regular expression</h3><p>First I parsed the HTML titles from pages with this regex</p><pre class="brush: plain; title: ; notranslate">
&lt;title&gt;(.*?)&lt;/title&gt;
</pre><p>This is a very simple regular expression. It says: find the string &lt;title&gt;; then &quot;.&quot; stands for any character, and * means the previous item can repeat 0 or more times. The ? after * makes the repetition lazy, so it matches as few characters as possible and stops at the first &lt;/title&gt;. &lt;/title&gt; means that the string &lt;/title&gt; must follow. The brackets define a group that captures whatever was matched in between.</p><p><br/></p><h3>1.1 Match anything between &lt;a&gt; and &lt;/a&gt;</h3><pre class="brush: plain; title: ; notranslate">
&lt;a&gt;(.*?)&lt;/a&gt;
</pre><p>This is the same idea applied to the &quot;a&quot; tag. Note that it only matches a bare &lt;a&gt; with no attributes.</p><p><br/></p><h3>1.2 Match anything in a tag</h3><p>The previous step only captures the link text of an a tag.</p><p>For &lt;a href=&quot;http://www.dscripts.net&quot;&gt;DSCRIPTS&lt;/a&gt; the most it could give us is &quot;DSCRIPTS&quot;</p><p>To extract the real URL we must reach the attributes of the tag, so let us go a little deeper</p><pre class="brush: plain; title: ; notranslate">
&lt;a(.*?)&lt;/a&gt;
</pre><p>This returns everything between &lt;a and &lt;/a&gt;. But what we actually need is what follows &lt;a href, so we write:</p><p><br/></p><h3>1.3 Match &lt;a href</h3><pre class="brush: plain; title: ; notranslate">
&lt;a href(.*?)&lt;/a&gt;
</pre><p>This would match <strong>=&quot;http://www.dscripts.net&quot;&gt;DSCRIPTS</strong></p><p><br/></p><pre class="brush: plain; title: ; notranslate">
&lt;a href=(.*?)&lt;/a&gt;
</pre><p>This would match &quot;http://www.dscripts.net&quot;&gt;DSCRIPTS</p><h3>1.4 Match both href attribute and link title</h3><p>So far the match still carries the tail &gt;DSCRIPTS, and we need to remove it. We can do this by declaring another group: the first one matches the href attribute and the second one matches the link title</p><pre class="brush: plain; title: ; notranslate">
&lt;a href=(.*?)&gt;(.*?)&lt;/a&gt;
</pre><p>Here the first group will match the link &quot;http://www.dscripts.net&quot; and the second one will match DSCRIPTS</p><p><br/></p><h3>1.5 Match a href between quotes</h3><p>In the examples so far we did not account for the quotes around the href attribute, so the group returned &quot;http://www.dscripts.net&quot; and not http://www.dscripts.net</p><p>This part is a little tricky, so let us first say what we are going to do.</p><p>We need to match one quote &quot; after href=, followed by any characters, and then another &quot;. Is that enough?</p><pre class="brush: plain; title: ; notranslate">
href=\&quot;(.*?)\&quot;
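</pre><p><strong>Note: inside a Java string literal the quotes must be escaped with a backslash \</strong></p><p>Not exactly. &quot;Any character&quot; also matches &quot; itself, so the group is not guaranteed to stop at the closing quote of that tag. A greedy (.*) runs on to the last quote of the last a tag in the document, and even the lazy (.*?) runs past the closing quote whenever the rest of the expression does not match right after it <img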
src='http://www.dscripts.net/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' /></p><p>&lt;a href=<span
style="background-color: rgb(255, 0, 0);">&quot;</span><span
style="background-color: rgb(255, 255, 0);">link1.html</span><span
style="background-color: rgb(255, 255, 0);">&quot;&gt;LINK 1&lt;/a&gt;</span></p><p><span
style="background-color: rgb(255, 255, 0);">&lt;a </span><span
style="background-color: rgb(255, 255, 0);">href</span><span
style="background-color: rgb(255, 255, 0);">=&quot;</span><span
style="background-color: rgb(255, 255, 0);">link2.html</span><span
style="background-color: rgb(255, 255, 0);">&quot;&gt;LINK 2&lt;/a&gt;</span></p><p><span
style="background-color: rgb(255, 255, 0);">&lt;a </span><span
style="background-color: rgb(255, 255, 0);">href</span><span
style="background-color: rgb(255, 255, 0);">=&quot;</span><span
style="background-color: rgb(255, 255, 0);">link3.html</span><span
style="background-color: rgb(255, 0, 0);">&quot;</span>&gt;LINK 3&lt;/a&gt;</p><p>So with a greedy match this returns the yellow-marked result. Is that really what you want?</p><p>Surely the answer is no.</p><p>So our expression must say that no &quot; may appear between the opening and closing quotes.</p><p>To do that we write an expression that matches any character except those in a set we choose</p><p>[^\&quot;]</p><p>This defines a character class that is true for anything other than &quot;, and it can repeat any number of times</p><p>[^\&quot;]*</p><p>So we rewrite our expression as follows</p><pre class="brush: plain; title: ; notranslate">
&lt;a href=\&quot;([^\&quot;]*)\&quot;&gt;(.*?)&lt;/a&gt;
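</pre><p>The difference between the dot and the negated class is easy to check by hand. The sketch below uses Python&#39;s re module, whose syntax for these constructs matches Java&#39;s. The sample string and the variable names are made up for illustration, and the sample is a bare run of attributes rather than full markup.</p><pre class="brush: python; title: ; notranslate">
```python
import re

# Hypothetical sample: two href attributes with another attribute between them.
sample = 'href="link1.html" class="nav" href="link2.html"'

# Greedy dot: runs from the first quote to the last quote in the text.
greedy = re.findall(r'href="(.*)"', sample)

# Negated class: stops at the closing quote of each attribute.
exact = re.findall(r'href="([^"]*)"', sample)

print(greedy)
print(exact)
```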
</pre><p><br/></p><h3>1.6 What if the a tag has some whitespace around it?</h3><p>So far we have assumed very neat HTML. The expression will match anything like</p><p>&lt;a href=&quot;http://www.yoursite1.com&quot;&gt;Site 1&lt;/a&gt;</p><p>&lt;a href=&quot;http://www.yoursite2.com&quot;&gt;Site 2&lt;/a&gt;</p><p>or anything similar.</p><p>But what about this?</p><p>&lt;a href<span
style="background-color: rgb(0, 255, 0);"> </span>=<span
style="background-color: rgb(0, 255, 0);"> </span>&quot;http://www.yoursite.com&quot;<span
style="background-color: rgb(0, 255, 0);"> </span>&gt;Site&lt;/a&gt;</p><p>or</p><p>&lt;a&nbsp;<span
style="background-color: rgb(0, 255, 0);">&nbsp;&nbsp;&nbsp; </span>href<span
style="background-color: rgb(0, 255, 0);">&nbsp;&nbsp;&nbsp; </span>=<span
style="background-color: rgb(0, 255, 0);">&nbsp;&nbsp; </span>&quot;http://www.yoursite.com&quot;<span
style="background-color: rgb(0, 255, 0);">&nbsp;&nbsp;&nbsp; </span>&gt;Site&lt;/a&gt;</p><p>This not gonna match <img
src='http://www.dscripts.net/wp-includes/images/smilies/icon_sad.gif' alt=':(' class='wp-smiley' /></p><p>so we need to modify our expression to allow this whitespaces</p><p>whitespaces are denoted as \s in regular expression.</p><p>So we modify our expression as</p><pre>
&lt;a <span style="background-color: rgb(0, 255, 0);">\s*?</span>href<span style="background-color: rgb(0, 255, 0);">\s*</span>=<span style="background-color: rgb(0, 255, 0);">\s*</span>\&quot;([^\&quot;]*)\&quot;<span style="background-color: rgb(0, 255, 0);">\s*</span>&gt;(.*?)&lt;/a&gt;
</pre><p>So now it can match these cases.</p><p>&nbsp;</p><h3>1.7 What if the tag has more attributes?</h3><p>You certainly cannot assume that an a tag will have only an href attribute! What if it looks like one of these?</p><p>&lt;a class=&quot;my-class&quot; href=&quot;http://www.dscirpt.net&quot;&gt;DSCRIPTS&lt;/a&gt;</p><p>&lt;a id=&quot;HOME&quot; class=&quot;my-class&quot; href=&quot;http://www.dscirpt.net&quot;&gt;DSCRIPTS&lt;/a&gt;</p><p>&lt;a class=&quot;my-class&quot; href=&quot;http://www.dscirpt.net&quot; id=&quot;HOME&quot; &gt;DSCRIPTS&lt;/a&gt;</p><p>&lt;a href=&quot;http://www.dscirpt.net&quot; id=&quot;HOME&quot; class=&quot;my-class&quot; &gt;DSCRIPTS&lt;/a&gt;</p><p>&nbsp;</p><p>In most cases you are going to face hyperlinks like these.</p><p>So we must also allow for other attributes before and after the href attribute. There can be many or none on either side.</p><p>So on both sides of href we can have any characters except &gt;, which marks the end of the start tag and the beginning of the link title</p><pre>
&lt;a<span style="background-color: rgb(0, 255, 0);">\s</span><span style="background-color: rgb(255, 160, 122);">[^&gt;]*</span>href\s*=\s*\&quot;([^\&quot;]*)\&quot;<span style="background-color: rgb(255, 160, 122);">[^&gt;]*</span>&gt;(.*?)&lt;/a&gt;
</pre><p>&nbsp;</p><p>As you can see, I added [^&gt;]* on both sides of the href attribute to say that anything may appear there except &gt;, which ends the start tag. This way we match any other attributes, whether or not they exist, on both sides of the href attribute</p><p>\s replaces the literal &quot; &quot; (space) and requires exactly one whitespace character immediately after &lt;a</p><p>&nbsp;</p><p>Now take a look at the final expression once again&#8230;</p><p>&nbsp;</p><p>&nbsp;</p><pre>
&lt;a            Must have &lt;a
\s            Must have one whitespace
[^&gt;]*         Can have anything except &gt; and can happen 0 or more times (Any attribute)
href          Must have href
\s*           May have whitespace 0 or more times
=             Must have =
\s*           May have whitespace 0 or more times
\&quot;            Must have one &quot;
  (             Start of first group
    [^\&quot;]*        Can have any character except &quot; and can repeat 0 or more times
  )             End of first group
\&quot;            Must have one &quot;
[^&gt;]*         Can have anything except &gt; and can happen 0 or more times (Any attribute)
&gt;             Must have &gt;
  (             Start of second group
    .*?           Can have any character 0 or more times, as few as possible (lazy)
  )             End of second group
&lt;/a&gt;          Must have end tag
</pre><p>&nbsp;</p><p>&nbsp;</p><h3>Drawbacks</h3><p>Although I tried to make it as efficient as possible, it still has several drawbacks. One of the most important is that it does not detect an href value in single quotes</p><p>&lt;a href=<strong><span
style="color: rgb(255, 0, 0);">&#39;</span></strong>http://www.dscripts.net<strong><span
style="color: rgb(255, 0, 0);">&#39;</span></strong>&gt;DSCRIPTS&lt;/a&gt;</p><p>It is also case sensitive, meaning it can only detect lowercase tags, not</p><p>&lt;A HREF=&#39;http://www.dscripts.net&#39;&gt;DSCRIPTS&lt;/a&gt;</p><p>I detect the end of the attributes with &gt;, but what about this?</p><p>&lt;a class=&quot;broker-class&quot; title=&quot;this will break &gt; regular expression&quot; href=<strong><span
style="color: rgb(255, 0, 0);">&#39;</span></strong>http://www.dscripts.net<strong><span
style="color: rgb(255, 0, 0);">&#39;</span></strong>&gt;DSCRIPTS&lt;/a&gt;</p><p>These cases are still not handled. I hope to improve the expression over time</p><p>&nbsp;</p><p>&nbsp;</p><h2>Regular Expressions in Java</h2><p>Now that we have the expression, we are going to implement it in Java.</p><p>To use regular expressions in Java we mainly use two classes, Pattern and Matcher, from the java.util.regex package.</p><p>Pattern is responsible for creating a compiled regular expression object from a string, and Matcher is responsible for matching it against the input data.</p><blockquote><p><strong>Note</strong> : In regular expressions the backslash has a special purpose. When writing a regular expression as a Java string we need to escape each backslash with another backslash.</p></blockquote><p>Here is a simple parser class used to parse links from a web page. I used a HashMap instead of an array because I do not need repeated links from the page. I also used the HashMap to count the occurrences of each link. The expression in the class also accepts single-quoted or unquoted href values.</p><pre class="brush: java; title: ; notranslate">
package crawler;

import java.util.HashMap;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/**
 * Extracts hyperlinks from HTML text and counts how often each link occurs.
 * @author burhan
 */
public class parser {
    // regular expression for parsing links; group 1 is the URL, and the lazy (.*?)
    // keeps two links on the same line from being matched as one
    private String regex_links = &quot;&lt;a\\s[^&gt;]*href\\s*=\\s*[\&quot;\&#39;]?([^\&quot;\&#39; ]*)[\&quot;\&#39;]?[^&gt;]*&gt;(.*?)&lt;/a&gt;&quot;;
    // hash map to store each link with its occurrence count
    private HashMap&lt;String, Integer&gt; link_map;

    public void parse(String data) {
        // create pattern object
        Pattern p = Pattern.compile(regex_links);
        // create matcher object
        Matcher m = p.matcher(data);

        String link = null;
        link_map = new HashMap&lt;String, Integer&gt;();

        // search the input string
        while (m.find()) {
            // the link URL is in group 1
            link = m.group(1);
            // check whether the hash map already contains the link
            if(link_map.containsKey(link))
                link_map.put(link, link_map.get(link)+1); // set count +1
            else
                link_map.put(link, 1); // set count 1
        }
    }

    // returns the links with their occurrence counts
    public HashMap&lt;String, Integer&gt; getLinks(){
        return link_map;
    }

}
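</pre><p>The note about doubling backslashes can be tried out on its own. The class below is only a minimal sketch and is not part of the crawler: the class name EscapeDemo, the method firstHref and the sample input are assumptions made for illustration. It takes the regular expression href\s*=\s*&quot;([^&quot;]*)&quot; and writes it as a Java string, so every backslash is doubled and every double quote is escaped.</p><pre class="brush: java; title: ; notranslate">
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical demo class, not part of the crawler package
public class EscapeDemo {
    // Regex as the engine sees it:  href\s*=\s*"([^"]*)"
    // Written as a Java string, each \ becomes \\ and each " becomes \"
    private static final Pattern HREF = Pattern.compile("href\\s*=\\s*\"([^\"]*)\"");

    // returns the first double-quoted href value in the input, or null if there is none
    public static String firstHref(String input) {
        Matcher m = HREF.matcher(input);
        return m.find() ? m.group(1) : null;
    }

    public static void main(String[] args) {
        // sample input with extra whitespace around the = sign
        // prints http://www.dscripts.net
        System.out.println(firstHref("href  =  \"http://www.dscripts.net\""));
    }
}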
</pre>]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2010/09/04/extract-hyperlinks-from-html-using-regular-expression-in-java/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>How to allow users to execute sudo command in ubuntu</title><link>http://www.dscripts.net/2010/08/19/how-to-allow-users-to-execute-sudo-command-in-ubuntu/</link> <comments>http://www.dscripts.net/2010/08/19/how-to-allow-users-to-execute-sudo-command-in-ubuntu/#comments</comments> <pubDate>Thu, 19 Aug 2010 08:46:09 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[Linux]]></category> <category><![CDATA[Bash]]></category> <category><![CDATA[Shell]]></category> <category><![CDATA[Ubuntu]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=143</guid> <description><![CDATA[If a user account cannot run the sudo command, it is because the account is not in the sudoers list. You must add the user to the sudoers list to allow the sudo command. The sudoers list is located in /etc/sudoers &#160; So open the file with the visudo command. You must use this command to edit this [...]]]></description> <content:encoded><![CDATA[<p>If a user account cannot run the sudo command, it is because the account is not in the sudoers list. You must add the user to the sudoers list to allow the sudo command. The sudoers list is located in /etc/sudoers</p><p> &nbsp;</p><p>So open the file with the visudo command. You must use this command to edit this file, because it checks the syntax before saving</p><p><span
id="more-143"></span></p><pre class="brush: bash; title: ; notranslate">
burhan@burhan-desktop:$ sudo visudo
</pre><p> If you want to give a user the same permissions as the root user, add a line for that user below the root entry</p><pre class="brush: bash; title: ; notranslate">
# User privilege specification
root    ALL=(ALL) ALL
</pre><p> &nbsp;</p><p> For example: <strong>username ALL=(ALL) ALL</strong></p><p> This allows the user to do anything root can do</p><p> &nbsp;</p><p> Here I would like to add users to the sudoers list so that they can run sudo commands after a password prompt.</p><p> I want to add all users of the nutch group to the sudoers list, so I changed it to something like this</p><pre class="brush: bash; title: ; notranslate">
# Allow members of group sudo to execute any command after they have
# provided their password
# (Note that later entries override this, so you might need to move
# it further down)
%sudo ALL=(ALL) ALL
%nutch ALL=(ALL) ALL
</pre>]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2010/08/19/how-to-allow-users-to-execute-sudo-command-in-ubuntu/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>How to add users and usergroup in ubuntu using terminal</title><link>http://www.dscripts.net/2010/08/14/how-to-add-users-and-usergroup-in-ubuntu-using-terminal/</link> <comments>http://www.dscripts.net/2010/08/14/how-to-add-users-and-usergroup-in-ubuntu-using-terminal/#comments</comments> <pubDate>Sat, 14 Aug 2010 08:20:06 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[Linux]]></category> <category><![CDATA[Bash]]></category> <category><![CDATA[Ubuntu]]></category> <category><![CDATA[User Mangement]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=129</guid> <description><![CDATA[It is very easy to create a new user account in Ubuntu using the terminal. Just use the adduser and addgroup commands. Here is an example: I want to create a user named &#34;nutch&#34; with a new &#34;nutch&#34; user group. Here is how I did it. First create the user group &#160; Now create the user account within [...]]]></description> <content:encoded><![CDATA[<p> It is very easy to create a new user account in Ubuntu using the terminal. Just use the adduser and addgroup commands.</p><p>Here is an example</p><p>I want to create a user named &quot;nutch&quot; with a new &quot;nutch&quot; user group. Here is how I did it.</p><p><span
id="more-129"></span></p><p>First create the user group</p><pre class="brush: bash; title: ; notranslate">
burhan@burhan-desktop:/usr/share$ sudo addgroup nutch
Adding group `nutch&#39; (GID 1002) ...
Done.
</pre><p> &nbsp;</p><p> Now create the user account within the user group</p><pre class="brush: bash; title: ; notranslate">
burhan@burhan-desktop:/usr/share$ sudo adduser nutch --ingroup nutch
Adding user `nutch&#39; ...
Adding new user `nutch&#39; (1002) with group `nutch&#39; ...
Creating home directory `/home/nutch&#39; ...
Copying files from `/etc/skel&#39; ...
Enter new UNIX password:
Retype new UNIX password:
passwd: password updated successfully
Changing the user information for nutch
Enter the new value, or press ENTER for the default
	Full Name []: NUTCH
	Room Number []: NUTCH
	Work Phone []: NUTCH
	Home Phone []: NUTCH
	Other []: NUTCH
Is the information correct? [Y/n] y
burhan@burhan-desktop:/usr/share$ su nutch
Password:
Added user nutch.

nutch@burhan-desktop:/usr/share$
</pre>]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2010/08/14/how-to-add-users-and-usergroup-in-ubuntu-using-terminal/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Change root user password in ubuntu using terminal (The short way)</title><link>http://www.dscripts.net/2010/08/14/change-root-user-password-in-ubuntu-using-terminal-the-short-way/</link> <comments>http://www.dscripts.net/2010/08/14/change-root-user-password-in-ubuntu-using-terminal-the-short-way/#comments</comments> <pubDate>Sat, 14 Aug 2010 08:04:44 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[Linux]]></category> <category><![CDATA[Bash]]></category> <category><![CDATA[Shell]]></category> <category><![CDATA[Ubuntu]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=121</guid> <description><![CDATA[If you forget the root user password, you can change it with one simple command in Ubuntu. Log in to your regular account. The sudo command must be available for your user account. Now open the terminal and enter the command This lets you change the root account password because you used sudo. Now [...]]]></description> <content:encoded><![CDATA[<p>If you forget the root user password, you can change it with one simple command in Ubuntu. Log in to your regular account. The sudo command must be available for your user account.</p><p>Now open the terminal and enter the command</p><pre class="brush: bash; title: ; notranslate">
sudo passwd
</pre><p>This lets you change the root account password because you used sudo. Now enter the new password and confirm it. Yes! You are done <img
src='http://www.dscripts.net/wp-includes/images/smilies/icon_smile.gif' alt=':)' class='wp-smiley' /></p> ]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2010/08/14/change-root-user-password-in-ubuntu-using-terminal-the-short-way/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> <item><title>Run cron from external server when you dont have cron support!</title><link>http://www.dscripts.net/2010/07/27/run-cron-from-external-server-when-you-dont-have-cron-support/</link> <comments>http://www.dscripts.net/2010/07/27/run-cron-from-external-server-when-you-dont-have-cron-support/#comments</comments> <pubDate>Tue, 27 Jul 2010 14:37:43 +0000</pubDate> <dc:creator>Burhan Uddin</dc:creator> <category><![CDATA[Blogs]]></category> <category><![CDATA[Cron]]></category> <category><![CDATA[Scheduled Task]]></category> <category><![CDATA[Tips & Tricks]]></category><guid
isPermaLink="false">http://www.dscripts.net/?p=169</guid> <description><![CDATA[Cron jobs are used to execute scheduled tasks on UNIX web servers. Cron is similar to Windows Scheduled Tasks. With it you can perform many routine tasks for your website or application, such as monthly bill payment checks, update checks, clearing cache or temporary files, and much more. In Linux it is set using the crontab [...]]]></description> <content:encoded><![CDATA[<p>Cron jobs are used to execute scheduled tasks on UNIX web servers. Cron is similar to Windows Scheduled Tasks. With it you can perform many routine tasks for your website or application, such as monthly bill payment checks, update checks, clearing cache or temporary files, and much more.</p><p>In Linux, cron jobs are set with the crontab command. If you have cPanel, it also provides a user interface to manage your cron jobs. Most paid hosting plans support this, but it is usually not available on free hosts. If you still need to run something on a regular schedule and your host does not provide this feature, you can still manage to do so&#8230;</p><p><span
id="more-169"></span></p><h3>You can be cronless, but don&#8217;t get hopeless!</h3><p>Fortunately, some providers offer free cron job services on the internet. They have many limitations on timing and on how often you can run a job, but you can still get the work done. Here are two I found by googling.</p><h3>1. <a
href="http://www.setcronjob.com/">http://www.setcronjob.com/</a></h3><p>This service lets you run free cron jobs at a minimum interval of 10 minutes. However, each run of a cron job costs resource points, and the cost of each job depends on how often it runs. When you are out of resource points, you cannot add any more cron jobs to your account.</p><h3>2. <a
href="http://www.cronless.com/">http://www.cronless.com</a></h3><p>This service lets you run as many cron jobs as you want, but each one no more than twice a day. That is, the minimum interval between runs of a cron job is 12 hours.</p> ]]></content:encoded> <wfw:commentRss>http://www.dscripts.net/2010/07/27/run-cron-from-external-server-when-you-dont-have-cron-support/feed/</wfw:commentRss> <slash:comments>0</slash:comments> </item> </channel> </rss>

