Technical blog, by Nishu Tayal<br />
<b>Agile In Data Science and Analytics</b> (published 2016-12-02, updated 2017-01-03)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Is Agile an effective way to herd data scientists into
the production pen, or just an excuse to avoid documentation and planning?</span> <span style="font-size: 10pt; line-height: 107%; text-indent: -18pt;">Which
components of Agile do we recommend for analytics PoCs and full-fledged
projects?</span><br />
<span style="font-size: 10pt; line-height: 107%; text-indent: -18pt;"><br /></span>
<span style="font-size: 10pt; line-height: 107%; text-indent: -18pt;">Let's discuss.</span><br />
<span style="font-size: 10pt; line-height: 107%; text-indent: -18pt;"><br /></span>
<span style="font-size: 10pt;">Every organization starts with business ambitions and
then creates a roadmap of the technology, people and investment needed to unlock
that business potential. To reach that objective, we go through a phase of
initial discussions to understand the requirements and the technical workloads, such as “I
need a Linux server, a database, a recommendation engine, tools to handle big
data...”</span><br />
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Technical requirements are usually straightforward, but analytical
work is vaguer and more uncertain: we
don’t know in advance which approach will solve the problem best, or how much
time it will take to reach a good solution.</span></div>
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Consider how such a project would go under the traditional
waterfall model:</span></div>
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"><br /></span></div>
<div class="MsoNormal">
<b><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Developing
a traditional analytics project:<o:p></o:p></span></b></div>
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Let’s say we need to build a recommendation engine for users.
The use case seems pretty easy. A traditional analytics team would spend a long time
building an engine that uses the entire user data set and runs content-based
recommendation (CBR) or collaborative filtering (CF). After a long effort, it might
deliver a powerful recommendation engine that provides near-real-time
recommendations to users. Throughout this
seemingly hassle-free development, there was no interaction with business people.</span></div>
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"><br /></span></div>
<div class="MsoNormal">
<b><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Challenges
in Traditional Approach</span></b><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">: </span></div>
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">We developed the entire engine but are not sure
about the correctness of the model. What if we used the wrong data or the wrong
variables? We don’t even know whether our data exploration and insights were
correct. Now assume the stakeholders reject the model because it
didn’t meet their expectations, and send back feedback. Time to rework. Wouldn’t it have been
better to use Agile from the start?</span></div>
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">An Agile approach would have helped here, through rapid,
iterative product development and short customer feedback cycles.</span></div>
<div class="MsoNormal">
</div>
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Our problem and opportunity lie at the intersection of
two trends. How can we incorporate data science and analytics, which is applied
research and needs exhaustive effort on an unpredictable timeline, into
agile application development? How can analytics applications do better than the traditional
waterfall model? How can we craft applications for unknown, evolving
data models?</span></div>
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"><br /></span></div>
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"></span></div>
<div class="MsoNormal">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmsH6J5mBxyS9Jkxz_1-TnoYxQaUMo0teEWxSI0r8pWbJXl93MNIzMlRszmcIiRU5afgP__m0KFsM1TjC9fS8uzI2UwZONm8vKtuBpGQPjik3oyro3SwV8UueTUWO3ofo3KDj0XTKASY_-/s1600/image1.png" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img border="0" height="296" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgmsH6J5mBxyS9Jkxz_1-TnoYxQaUMo0teEWxSI0r8pWbJXl93MNIzMlRszmcIiRU5afgP__m0KFsM1TjC9fS8uzI2UwZONm8vKtuBpGQPjik3oyro3SwV8UueTUWO3ofo3KDj0XTKASY_-/s400/image1.png" width="400" /></a></div>
<b><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">What is Agile</span></b></div>
<div class="MsoNormal">
<b><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"><br /></span></b></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Agile
software development focuses on four values (from the Agile
Manifesto):</span></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
</div>
<ul style="text-align: left;">
<li><span style="font-size: 13.3333px;"><b>Individuals and interactions</b> over processes and tools</span></li>
<li><span style="font-size: 13.3333px;"><b>Working software</b> over comprehensive documentation</span></li>
<li><span style="font-size: 13.3333px;"><b>Customer collaboration</b> over contract negotiation</span></li>
<li><span style="font-size: 13.3333px;"><b>Responding to change </b>over following a plan</span></li>
</ul>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Engineering
a product and engineering data science are different, because
data science is less deterministic. It takes a lot of creativity and thought
to derive the best approach. Agile helps to manage this in
cycles, in which the team explores, learns something about the data, shares the
insights with the business team and stakeholders, aligns needs and approach,
takes the feedback and moves forward in the same direction.</span></div>
<span style="font-family: "calibri" , sans-serif; font-size: 10.0pt; line-height: 107%;"><br /></span>
<span style="font-family: "calibri" , sans-serif; font-size: 10.0pt; line-height: 107%;"></span><br />
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<b><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">How the Agile Analytics approach unfolds</span></b></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<b><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"><br /></span></b></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">The main
difference between the traditional and the Agile analytics approach is the iterative
process: sharing what has been learned with stakeholders, getting rapid feedback, and
learning through new business questions and descriptions of the datasets.</span></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"></span></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">A team of data
scientists, business analysts and other subject-matter experts (SMEs) works with the stakeholders to
discuss each question until they have:</span></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
</div>
<ul style="text-align: left;">
<li><span style="font-size: 13.3333px;">A clear scope that is as narrow as possible</span></li>
<li><span style="font-size: 13.3333px;">Potential datasets and variables to be used for analysis</span></li>
<li><span style="font-size: 13.3333px;">Questions to be answered</span></li>
</ul>
<span style="font-size: 10pt;">Data
scientists provide insights into the nature and quality of the dataset, hone
the questions and hypotheses, and provide a concrete list of algorithms that could
answer those questions. These outputs turn into proofs of concept
or prototypes of an analytics solution.</span><br />
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">It is a voyage of discovery. The structure below, known as the data-value pyramid, illustrates this.</span></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"><br /></span></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4mIRdeWyV_kW5EadmA3BU9nCXncWJb5vYxCI4ZYIwwKPvfUnr0jFQId82KF1MbR9o7tyEvDuRsR4oFTROFadWuezqytxOgjWWh9KCgI64AuyQLlH1Vey8SbXn4tR_yxfMPERrnmwUrz7x/s1600/image2.png.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="240" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEj4mIRdeWyV_kW5EadmA3BU9nCXncWJb5vYxCI4ZYIwwKPvfUnr0jFQId82KF1MbR9o7tyEvDuRsR4oFTROFadWuezqytxOgjWWh9KCgI64AuyQLlH1Vey8SbXn4tR_yxfMPERrnmwUrz7x/s640/image2.png.jpg" width="640" /></a></div>
<div>
<span style="font-family: "calibri" , sans-serif; font-size: 10.0pt; line-height: 107%;"><br /></span></div>
<div>
<span style="font-size: 10pt;">Every
project needs investment, and building an analytics solution is generally costlier
than developing application software. Because each business silo can point to a
different domain or data source, the investment carries a high
risk.</span><br />
<span style="font-size: 10pt; line-height: 107%;">Agile
Analytics helps minimize the risk of pursuing blind alleys. With its
iterative approach and regular interaction with the business team, it reduces the
risk of implementing models that turn out to be useless.</span></div>
<div>
<span style="font-family: "calibri" , sans-serif; font-size: 10.0pt; line-height: 107%;"><span style="font-size: 10pt; line-height: 107%;"><br /></span></span></div>
<div>
<b><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">References</span></b><span style="font-size: 10pt; line-height: 107%;">:</span></div>
<div>
<ul style="text-align: left;">
<li><span style="font-size: 13.3333px;">Agile Data Science: Building Data Analytics Applications (book)</span></li>
<li><span style="font-size: 13.3333px;"><a href="http://agilemanifesto.org/">http://agilemanifesto.org/</a></span></li>
<li><span style="font-size: 13.3333px;"><a href="https://www.oreilly.com/ideas/using-agile-development-techniques-for-data-science-projects">https://www.oreilly.com/ideas/using-agile-development-techniques-for-data-science-projects</a></span></li>
<li><span style="font-size: 13.3333px;"><a href="https://www.thoughtworks.com/insights/blog/introducing-agile-analytics">https://www.thoughtworks.com/insights/blog/introducing-agile-analytics</a></span></li>
<li><span style="font-size: 13.3333px;"><a href="http://data-informed.com/benefits-agile-analytics-development-right/">http://data-informed.com/benefits-agile-analytics-development-right/</a></span></li>
<li><span style="font-size: 13.3333px;"><a href="http://ptgmedia.pearsoncmg.com/images/9780321504814/samplepages/032150481X.pdf">http://ptgmedia.pearsoncmg.com/images/9780321504814/samplepages/032150481X.pdf</a></span></li>
</ul>
</div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"><o:p></o:p></span></div>
</div>
</div>
<b>How to write Spark jobs in Java for Spark Job Server</b> (Nishu Tayal, published 2016-06-03)<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
In the previous <a href="http://nishutayaltech.blogspot.in/2016/05/how-to-run-spark-job-server-and-spark.html">post</a>,
we learnt how to set up Spark Job Server and run Spark jobs on it. So far, we have only run Scala programs on the job server. Now
we’ll see how to write Spark jobs in Java that run on the job server.</div>
<div class="MsoNormal">
<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
In Scala, a job must implement the <b>SparkJob</b> trait, so a job looks like this:</div>
<div class="MsoNormal">
<br /></div>
<div style="background: #EAF1DD; border: solid windowtext 1.0pt; mso-background-themecolor: accent3; mso-background-themetint: 51; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 1.0pt 1.0pt 1.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">object SampleJob extends SparkJob {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">&nbsp;&nbsp;override def runJob(sc: SparkContext, jobConfig: Config): Any = ???</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">&nbsp;&nbsp;override def validate(sc: SparkContext, config: Config): SparkJobValidation = ???</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">}</span></div>
</div>
<div style="font-weight: bold;">
<b><br /></b></div>
These methods do the following:<ul type="disc">
<li class="MsoNormal">The <b>runJob</b> method contains
the implementation of the job. The SparkContext is managed by the
job server and is provided to the job through this method. This
relieves the developer of the boilerplate configuration management that
comes with creating a Spark job, and allows the job server to manage
and re-use contexts.</li>
</ul>
<ul type="disc">
<li class="MsoNormal">The <b>validate</b> method allows for an
initial validation of the context and any provided configuration. If the
context and configuration are OK to run the job, returning <b>spark.jobserver.SparkJobValid</b>
lets the job execute; otherwise, returning <b>spark.jobserver.SparkJobInvalid(reason)</b> prevents the job
from running and conveys the reason for the failure. In this
case, the call immediately returns an HTTP/1.1 400 Bad
Request status code.<br />
validate helps prevent running jobs that would eventually fail
because of missing or wrong configuration, which saves both time and resources.</li>
</ul>
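<div class="MsoNormal">
The validate-then-run flow described above can be sketched in plain Java. This is only an illustration of the control flow, not the spark-jobserver API: the <b>Validation</b> class, the <b>submit</b> method and the <b>input.string</b> configuration key are hypothetical stand-ins, and a plain Map takes the place of the real Config and SparkContext objects.</div>

```java
import java.util.Collections;
import java.util.Map;

// Illustrative sketch only: every type and name here is a hypothetical
// stand-in, not part of spark-jobserver.
public class ValidateSketch {

    // Stand-in for a validation result (valid, or invalid with a reason).
    static final class Validation {
        final boolean valid;
        final String reason; // null when valid

        private Validation(boolean valid, String reason) {
            this.valid = valid;
            this.reason = reason;
        }

        static Validation ok() { return new Validation(true, null); }
        static Validation invalid(String reason) { return new Validation(false, reason); }
    }

    // Cheap configuration check, done before any real work starts.
    static Validation validate(Map<String, String> config) {
        String input = config.get("input.string"); // hypothetical config key
        if (input == null || input.trim().isEmpty()) {
            return Validation.invalid("missing config value: input.string");
        }
        return Validation.ok();
    }

    // The actual work; only reached when validate() has passed.
    static int runJob(Map<String, String> config) {
        return config.get("input.string").trim().split("\\s+").length;
    }

    // Mimics the server-side flow: validate first, run only if valid.
    static String submit(Map<String, String> config) {
        Validation v = validate(config);
        if (!v.valid) {
            return "400 Bad Request: " + v.reason;
        }
        return "OK: " + runJob(config) + " words";
    }

    public static void main(String[] args) {
        System.out.println(submit(Collections.singletonMap("input.string", "a b a")));
        System.out.println(submit(Collections.<String, String>emptyMap()));
    }
}
```

<div class="MsoNormal">
With an empty configuration, the sketch rejects the job before runJob is ever called, which is the saving in time and resources that validate is meant to provide.</div>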
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
In Java, we need to extend the <a href="https://github.com/spark-jobserver/spark-jobserver/blob/a8805815585d384253ffbb1712bc2a25c0664b68/job-server-api/src/spark.jobserver/JavaSparkJob.scala">JavaSparkJob</a>
class. It has the following methods, which we override in the program:</div>
<div class="MsoNormal">
</div>
<ul style="text-align: left;">
<li><span style="font-family: Consolas; font-size: 9pt;">runJob(jsc: JavaSparkContext, jobConfig: Config)</span></li>
<li><span style="font-family: Consolas; font-size: 9pt;">validate(sc: SparkContext, config: Config)</span></li>
<li><span style="font-family: Consolas; font-size: 9pt;">invalidate(jsc: JavaSparkContext, config: Config)</span></li>
</ul>
<br />
<div class="MsoNormal">
<o:p></o:p></div>
<div class="MsoNormal">
The JavaSparkJob class is available in the <b>job-server-api</b> package. Build
the <a href="https://github.com/spark-jobserver/spark-jobserver/tree/master/job-server-api">job-server-api</a>
source code and add the resulting jar to your project. Add Spark and the other required dependencies to
your pom.xml.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Let’s start with the basic WordCount example:</div>
<div class="MsoNormal">
<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-size: large;">WordCount.java:</span></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div style="background: #EAF1DD; border: solid windowtext 1.0pt; mso-background-themecolor: accent3; mso-background-themetint: 51; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 1.0pt 1.0pt 1.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">package</span></b><span style="font-family: Consolas; font-size: 9pt;"> spark.jobserver;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> java.io.Serializable;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> java.util.Arrays;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> java.util.List;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> java.util.regex.Pattern;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> org.apache.commons.lang.StringUtils;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> org.apache.spark.SparkContext;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;">
org.apache.spark.api.java.JavaPairRDD;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> org.apache.spark.api.java.JavaRDD;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;">
org.apache.spark.api.java.JavaSparkContext;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;">
org.apache.spark.api.java.function.FlatMapFunction;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;">
org.apache.spark.api.java.function.Function2;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;">
org.apache.spark.api.java.function.PairFunction;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> scala.Tuple2;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> spark.jobserver.JavaSparkJob;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> spark.jobserver.SparkJobInvalid;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> spark.jobserver.SparkJobValid$;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> spark.jobserver.SparkJobValidation;</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">import</span></b><span style="font-family: Consolas; font-size: 9pt;"> com.typesafe.config.Config;</span></div>
<pre style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; padding: 0in;"><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">public</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">class</span></b><span style="font-family: Consolas; font-size: 9pt;"> Wordcount </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">extends</span></b><span style="font-family: Consolas; font-size: 9pt;"> JavaSparkJob </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">implements</span></b><span style="font-family: Consolas; font-size: 9pt;"> Serializable {</span><span style="font-family: Consolas; font-size: 9pt;"> </span></pre>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">private</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">static</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">final</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">long</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><i><span style="color: #0000c0; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">serialVersionUID</span></i><span style="font-family: Consolas; font-size: 9pt;"> = 1L;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in; text-indent: 0.5in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">private</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">static</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">final</span></b><span style="font-family: Consolas; font-size: 9pt;"> Pattern </span><i><span style="color: #0000c0; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">SPACE</span></i><span style="font-family: Consolas; font-size: 9pt;"> = Pattern.<i>compile</i>(</span><span style="color: #2a00ff; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">" "</span><span style="font-family: Consolas; font-size: 9pt;">);</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in; text-indent: 0.5in;">
<b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">static</span></b><span style="font-family: Consolas; font-size: 9pt;"> String </span><i><span style="color: #0000c0; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">fileName</span></i><span style="font-family: Consolas; font-size: 9pt;"> = StringUtils.</span><i><span style="color: #0000c0; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">EMPTY</span></i><span style="font-family: Consolas; font-size: 9pt;">;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><span style="color: #3f7f5f; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">// Entry point called by the job server: counts the words in the file named by "input.filename".</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">public</span></b><span style="font-family: Consolas; font-size: 9pt;"> Object runJob(JavaSparkContext jsc,
Config config) {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">try</span></b><span style="font-family: Consolas; font-size: 9pt;"> {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> JavaRDD<String>
lines = jsc.textFile(</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> config.getString(</span><span style="color: #2a00ff; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">"input.filename"</span><span style="font-family: Consolas; font-size: 9pt;">), 1);</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><span style="color: #3f7f5f; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">// Split each line on single spaces into individual words.</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> JavaRDD<String>
words = lines</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> .flatMap(</span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">new</span></b><span style="font-family: Consolas; font-size: 9pt;"> FlatMapFunction<String,
String>() {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">public</span></b><span style="font-family: Consolas; font-size: 9pt;"> Iterable<String> call(String s)
{</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">return</span></b><span style="font-family: Consolas; font-size: 9pt;"> Arrays.<i>asList</i>(</span><i><span style="color: #0000c0; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">SPACE</span></i><span style="font-family: Consolas; font-size: 9pt;">.split(s));</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> }</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> });</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><span style="color: #3f7f5f; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">// Pair each word with 1, then sum the counts per word.</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> JavaPairRDD<String,
Integer> counts = words.mapToPair(</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">new</span></b><span style="font-family: Consolas; font-size: 9pt;"> PairFunction<String, String,
Integer>() {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">public</span></b><span style="font-family: Consolas; font-size: 9pt;"> Tuple2<String, Integer>
call(String s) {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">return</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">new</span></b><span style="font-family: Consolas; font-size: 9pt;"> Tuple2<String, Integer>(s, 1);</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> }</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> }).reduceByKey(</span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">new</span></b><span style="font-family: Consolas; font-size: 9pt;"> Function2<Integer, Integer, Integer>() {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">public</span></b><span style="font-family: Consolas; font-size: 9pt;"> Integer call(Integer i1, Integer i2)
{</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">return</span></b><span style="font-family: Consolas; font-size: 9pt;"> i1 + i2;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> }</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> });</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> List<Tuple2<String,
Integer>> output = counts.collect();</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> System.</span><i><span style="color: #0000c0; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">out</span></i><span style="font-family: Consolas; font-size: 9pt;">.println(output);</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">return</span></b><span style="font-family: Consolas; font-size: 9pt;"> output;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> } </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">catch</span></b><span style="font-family: Consolas; font-size: 9pt;"> (Exception e) {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> e.printStackTrace();</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">return</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">null</span></b><span style="font-family: Consolas; font-size: 9pt;">;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> }</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> }</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><span style="color: #3f7f5f; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">// Called by the job server before runJob to check that "input.filename" is set.</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">public</span></b><span style="font-family: Consolas; font-size: 9pt;"> SparkJobValidation
validate(SparkContext sc, Config config) {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
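<span style="font-family: Consolas; font-size: 9pt;"> String filename
= config.hasPath(</span><span style="color: #2a00ff; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">"input.filename"</span><span style="font-family: Consolas; font-size: 9pt;">) ? config.getString(</span><span style="color: #2a00ff; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">"input.filename"</span><span style="font-family: Consolas; font-size: 9pt;">) : </span><span style="color: #2a00ff; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">""</span><span style="font-family: Consolas; font-size: 9pt;">;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>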
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">if</span></b><span style="font-family: Consolas; font-size: 9pt;"> (!filename.isEmpty()) {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">return</span></b><span style="font-family: Consolas; font-size: 9pt;"> SparkJobValid$.</span><i><span style="color: #0000c0; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">MODULE$</span></i><span style="font-family: Consolas; font-size: 9pt;">;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> } </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">else</span></b><span style="font-family: Consolas; font-size: 9pt;"> {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">return</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">new</span></b><span style="font-family: Consolas; font-size: 9pt;"> SparkJobInvalid(</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><span style="color: #2a00ff; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">"Input parameter is missing. Please specify the
filename"</span><span style="font-family: Consolas; font-size: 9pt;">);</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> }</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> }</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">public</span></b><span style="font-family: Consolas; font-size: 9pt;"> String invalidate(JavaSparkContext
jsc, Config config) {</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">return</span></b><span style="font-family: Consolas; font-size: 9pt;"> </span><b><span style="color: #7f0055; font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;">null</span></b><span style="font-family: Consolas; font-size: 9pt;">;</span><span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 9.0pt; mso-bidi-font-size: 10.0pt;"> </span><span style="font-family: Consolas; font-size: 9pt;"> }</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; padding: 0in;">
<span style="font-family: Consolas; font-size: 9pt; line-height: 115%;">}</span><span style="font-family: Consolas; font-size: 12.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;"><o:p></o:p></span></div>
</div>
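<div class="MsoNormal">
The guard in validate() can be tried out without Spark or the Job Server on the classpath. The sketch below is only a stand-in, not the Job Server API: a plain Map replaces Config, and the class name ValidationSketch and its return convention (null for valid, a message for invalid) are made up for illustration. It treats a missing key the same as an empty value. That matters if Config is Typesafe Config, as it is in Spark Job Server, because getString throws ConfigException.Missing for an absent key instead of returning an empty string.</div>

```java
import java.util.HashMap;
import java.util.Map;

/** Dependency-free sketch of the validate() guard; a Map stands in for Config. */
final class ValidationSketch {

    /** Returns null when the input is valid, otherwise an error message. */
    static String validate(Map<String, String> config) {
        // Treat an absent key and an empty (or blank) value the same way.
        String filename = config.get("input.filename");
        if (filename == null || filename.trim().isEmpty()) {
            return "Input parameter is missing. Please specify the filename";
        }
        return null;
    }

    public static void main(String[] args) {
        Map<String, String> config = new HashMap<>();
        System.out.println(validate(config));            // key absent: error message
        config.put("input.filename", "/tmp/input.txt");  // hypothetical path
        System.out.println(validate(config));            // key present: null, i.e. valid
    }
}
```

<div class="MsoNormal">
In the real job, the equivalent check would probably call config.hasPath("input.filename") before getString.</div>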
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
The next step is to compile the code, build the jar, and upload it to the Job Server.<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Your Spark job is now ready to run on the Job Server.</div>
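<div class="MsoNormal">
As a rough sketch of the upload step: the Job Server accepts a jar as the body of a POST to /jars/&lt;appName&gt;. The base URL (http://localhost:8090), the app name and the jar path used here are assumptions for illustration, and JarUploadSketch is a hypothetical helper, not part of the Job Server.</div>

```java
import java.io.IOException;
import java.io.OutputStream;
import java.net.HttpURLConnection;
import java.net.URL;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper that uploads a job jar to a running Job Server. */
final class JarUploadSketch {

    /** Builds the upload URL: <base>/jars/<appName>. */
    static String uploadUrl(String baseUrl, String appName) {
        // Drop a trailing slash so the path is not doubled.
        String base = baseUrl.endsWith("/")
                ? baseUrl.substring(0, baseUrl.length() - 1)
                : baseUrl;
        return base + "/jars/" + appName;
    }

    /** Sends the jar bytes as the POST body and returns the HTTP status code. */
    static int upload(String baseUrl, String appName, Path jar) throws IOException {
        HttpURLConnection conn =
                (HttpURLConnection) new URL(uploadUrl(baseUrl, appName)).openConnection();
        conn.setRequestMethod("POST");
        conn.setDoOutput(true);
        try (OutputStream out = conn.getOutputStream()) {
            Files.copy(jar, out);
        }
        int status = conn.getResponseCode();
        conn.disconnect();
        return status;
    }

    public static void main(String[] args) throws IOException {
        if (args.length < 2) {
            System.out.println("usage: JarUploadSketch <appName> <path-to-jar>");
            return;
        }
        // Assumed server address; change it to match your setup.
        int status = upload("http://localhost:8090", args[0], Paths.get(args[1]));
        System.out.println("HTTP status: " + status);
    }
}
```

<div class="MsoNormal">
The same upload can be done from the command line with curl and its --data-binary option.</div>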
</div>
Nishu Tayal (http://www.blogger.com/profile/12557963497953617072)
How to run Spark Job Server and Spark jobs
Published 2016-05-26T23:22:00.001+05:30, updated 2016-05-26T23:22:58.662+05:30
<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
<div class="MsoNormal">
<div class="MsoNormal">
<div class="MsoNormal">
<a href="https://github.com/spark-jobserver/spark-jobserver">Spark
Job server</a> provides a RESTful interface for submission and management of
Spark jobs, jars and job contexts. It facilitates sharing of jobs and RDD data
in a single context. It can also run standalone jobs. Job history and configuration are persisted.<br />
<br /></div>
<div class="MsoNormal">
<o:p></o:p></div>
<div class="MsoNormal" style="text-align: justify;">
<div class="MsoNormal">
<b><u>Features:<o:p></o:p></u></b></div>
<div class="MsoNormal">
Some of the features are:<br />
<span style="font-family: Symbol; text-indent: -0.25in;"> ·<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Simple REST interface</span><br />
<span style="font-family: Symbol; text-indent: -0.25in;"> ·<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Separate JVM per SparkContext for isolation</span><br />
<span style="font-family: Symbol; text-indent: -0.25in;"> ·<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Separate jar uploading step for faster job execution</span><br />
<span style="font-family: Symbol; text-indent: -0.25in;"> ·<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Supports low-latency jobs via long-running job contexts</span><br />
<span style="font-family: Symbol; text-indent: -0.25in;"> ·<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Asynchronous and synchronous job API</span><br />
<span style="font-family: Symbol; text-indent: -0.25in;"> ·<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Kill running jobs by stopping the context or deleting the job</span><br />
<span style="font-family: Symbol; text-indent: -0.25in;"> ·<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Named Objects (RDD/DataFrames) to cache and retrieve by name, improving object sharing and reuse among jobs</span><br />
<span style="font-family: Symbol; text-indent: -0.25in;"> ·<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Preliminary support for Java</span></div>
<div class="MsoNormal">
<span style="text-indent: -0.25in;"><br /></span></div>
</div>
<div class="MsoNormal">
<span style="font-size: large;"><b><u>Setup Spark Job Server:</u></b></span><br />
<b><br /></b></div>
<div class="MsoNormal">
<table align="left" border="0" cellpadding="0" cellspacing="0" class="MsoNormalTable" style="border-collapse: collapse; margin-left: 6.75pt; margin-right: 6.75pt; mso-padding-alt: 0in 0in 0in 0in; mso-table-anchor-horizontal: column; mso-table-anchor-vertical: paragraph; mso-table-left: left; mso-table-lspace: 9.0pt; mso-table-rspace: 9.0pt; mso-yfti-tbllook: 1184; width: 279px;">
<thead>
<tr style="height: 19.2pt; mso-yfti-firstrow: yes; mso-yfti-irow: 0;">
<td style="background: white; border: solid #DDDDDD 1.0pt; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div align="center" class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly; text-align: center;">
<b><span style="font-family: "times new roman" , serif; font-size: 12.0pt;">Job Server Version</span></b><span style="font-family: "times new roman" , serif; font-size: 12.0pt;"><o:p></o:p></span></div>
</td>
<td style="background: white; border-left: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-left-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="font-family: "times new roman" , serif; font-size: 12.0pt;">Spark
version</span></b><span style="font-family: "times new roman" , serif; font-size: 12.0pt;"><o:p></o:p></span></div>
</td>
</tr>
</thead>
<tbody>
<tr style="height: 19.2pt; mso-yfti-irow: 1;">
<td style="background: white; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.3.1<o:p></o:p></span></div>
</td>
<td style="background: white; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.9.1<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.2pt; mso-yfti-irow: 2;">
<td style="background: #F8F8F8; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.4.0<o:p></o:p></span></div>
</td>
<td style="background: #F8F8F8; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">1.0.2<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.2pt; mso-yfti-irow: 3;">
<td style="background: white; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.4.1<o:p></o:p></span></div>
</td>
<td style="background: white; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">1.1.0<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.2pt; mso-yfti-irow: 4;">
<td style="background: #F8F8F8; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.5.0
<o:p></o:p></span></div>
</td>
<td style="background: #F8F8F8; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">1.2.0<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.2pt; mso-yfti-irow: 5;">
<td style="background: white; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.5.1<o:p></o:p></span></div>
</td>
<td style="background: white; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">1.3.0<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.2pt; mso-yfti-irow: 6;">
<td style="background: #F8F8F8; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.5.2<o:p></o:p></span></div>
</td>
<td style="background: #F8F8F8; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">1.3.1 <o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.2pt; mso-yfti-irow: 7;">
<td style="background: white; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.6.0<o:p></o:p></span></div>
</td>
<td style="background: white; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">1.4.1<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.2pt; mso-yfti-irow: 8;">
<td style="background: #F8F8F8; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.6.1<o:p></o:p></span></div>
</td>
<td style="background: #F8F8F8; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">1.5.2 <o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.2pt; mso-yfti-irow: 9;">
<td style="background: white; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">0.6.2<o:p></o:p></span></div>
</td>
<td style="background: white; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">1.6.1<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.2pt; mso-yfti-irow: 10; mso-yfti-lastrow: yes;">
<td style="background: #F8F8F8; border-top: none; border: solid #DDDDDD 1.0pt; height: 19.2pt; mso-border-top-alt: solid #DDDDDD 1.0pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 96.9pt;" width="129"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">Master(0.13.x)<o:p></o:p></span></div>
</td>
<td style="background: #F8F8F8; border-bottom: solid #DDDDDD 1.0pt; border-left: none; border-right: solid #DDDDDD 1.0pt; border-top: none; height: 19.2pt; padding: 4.5pt 9.75pt 4.5pt 9.75pt; width: 112.3pt;" width="150"><div class="MsoNormal" style="line-height: 19.2pt; margin-bottom: .0001pt; margin-bottom: 0in; mso-element-anchor-horizontal: column; mso-element-anchor-vertical: paragraph; mso-element-frame-hspace: 9.0pt; mso-element-wrap: around; mso-element: frame; mso-height-rule: exactly;">
<span style="font-family: "times new roman" , serif; font-size: 12.0pt;">1.6.1 <span style="font-size: 12pt;"><o:p></o:p></span></span></div>
</td>
</tr>
</tbody></table>
</div>
<br />
<div class="MsoNormal">
<b><u>Pre-requisites:</u></b><br />
To set up the server, you need:<br />
<br />
<div class="MsoListParagraphCxSpFirst" style="mso-list: l0 level1 lfo1; text-indent: -.25in;">
<div style="text-align: left;">
<span style="font-family: "symbol"; text-indent: -0.25in;"> ·<span style="font-family: "times new roman"; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">64-bit operating system</span></div>
<span style="font-family: "symbol"; text-indent: -0.25in;"> ·<span style="font-family: "times new roman"; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Java 8</span><br />
<span style="font-family: "symbol"; text-indent: -0.25in;"> ·<span style="font-family: "times new roman"; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">sbt</span><br />
<span style="font-family: "symbol"; text-indent: -0.25in;"> ·<span style="font-family: "times new roman"; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">curl</span><br />
<span style="font-family: "symbol"; text-indent: -0.25in;"> ·<span style="font-family: "times new roman"; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">git</span><br />
<span style="font-family: "symbol"; text-indent: -0.25in;"> ·<span style="font-family: "times new roman"; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Spark</span><o:p></o:p></div>
<div class="MsoNormal">
<br />
Make sure the Job Server version is compatible with your Spark version. The table above lists the compatible versions.</div>
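<div class="MsoNormal">
The compatibility table can also be written down as a small lookup, which is handy in a setup script or a sanity check. This is only a sketch: the class and method names are invented for illustration, the first column is read as the Job Server release, and the master row is left out because it does not name a fixed release.</div>

```java
import java.util.Collections;
import java.util.LinkedHashMap;
import java.util.Map;

/** Illustrative lookup built from the version table in this post. */
final class CompatibilitySketch {

    /** Job Server release -> Spark version it was built against. */
    static final Map<String, String> SPARK_BY_JOB_SERVER;

    static {
        Map<String, String> m = new LinkedHashMap<>();
        m.put("0.3.1", "0.9.1");
        m.put("0.4.0", "1.0.2");
        m.put("0.4.1", "1.1.0");
        m.put("0.5.0", "1.2.0");
        m.put("0.5.1", "1.3.0");
        m.put("0.5.2", "1.3.1");
        m.put("0.6.0", "1.4.1");
        m.put("0.6.1", "1.5.2");
        m.put("0.6.2", "1.6.1");
        SPARK_BY_JOB_SERVER = Collections.unmodifiableMap(m);
    }

    /** True only when the table pairs this Job Server release with this Spark version. */
    static boolean isCompatible(String jobServerVersion, String sparkVersion) {
        // An unknown release maps to null, so equals() returns false.
        return sparkVersion.equals(SPARK_BY_JOB_SERVER.get(jobServerVersion));
    }

    public static void main(String[] args) {
        for (Map.Entry<String, String> e : SPARK_BY_JOB_SERVER.entrySet()) {
            System.out.println("Job Server " + e.getKey() + " -> Spark " + e.getValue());
        }
    }
}
```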
<span style="text-align: justify;"><br /></span>
<span style="text-align: justify;">You can install Java 8 from the </span><a href="http://www.oracle.com/technetwork/java/javase/downloads/jdk8-downloads-2133151.html" style="text-align: justify;">Oracle JDK 8 downloads page</a><span style="text-align: justify;">.</span></div>
<div class="MsoNormal" style="text-align: justify;">
<o:p></o:p></div>
<div class="MsoNormal">
<br />
For sbt, refer to the <a href="http://www.scala-sbt.org/release/docs/Setup.html">official sbt site</a>.<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br />
<br />
<br />
<br />
<br />
<br />
<br />
<b style="text-indent: -4.5pt;"> For CentOS users:</b></div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 0in 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 4.0pt 1.0pt 4.0pt; padding: 0in;">
sudo yum install curl<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 4.0pt 1.0pt 4.0pt; padding: 0in;">
sudo yum install git<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 4.0pt 1.0pt 4.0pt; padding: 0in;">
sudo yum install sbt<o:p></o:p></div>
</div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<br /></div>
<div class="MsoNormal" style="margin-left: -4.5pt;">
<b> For Ubuntu users:<o:p></o:p></b></div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 0in 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 4.0pt 1.0pt 4.0pt; padding: 0in;">
sudo
apt-get install curl<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 4.0pt 1.0pt 4.0pt; padding: 0in;">
sudo
apt-get install git<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 4.0pt 1.0pt 4.0pt; padding: 0in;">
sudo
apt-get install sbt<o:p></o:p></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Download the <a href="http://spark.apache.org/downloads.html">Spark</a>
package and set it up. Windows users can refer to <a href="http://nishutayaltech.blogspot.in/2015/04/how-to-run-apache-spark-on-windows7-in.html">How
to setup Spark on windows</a>. Once Spark is set up, start the Spark master and worker
daemons.</div>
<div class="MsoNormal">
<o:p></o:p></div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
[xuser@machine123
spark-1.6.1-bin-hadoop2.6]$ sbin/start-all.sh<o:p></o:p></div>
</div>
<div class="MsoNormal">
<br />
Now clone the <a href="https://github.com/spark-jobserver/spark-jobserver">Spark Job Server repo</a>
to your local machine.<br />
<br /></div>
<div class="MsoNormal">
<o:p></o:p></div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
[xuser@machine123 ~]$ git
clone https://github.com/spark-jobserver/spark-jobserver.git<o:p></o:p></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Run the <b>sbt</b> command in the cloned repo. It builds the project and opens the sbt
shell. The first run of <b>sbt</b>
takes a while because it downloads dependencies. Then type <span style="background: rgb(217, 217, 217);">re-start</span> in the sbt shell to start the server:<br />
<br /></div>
<div class="MsoNormal">
<o:p></o:p></div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<b>[xuser@machine123
spark-jobserver]$ sbt</b><code><span style="border: none 1.0pt; color: #333333; font-family: "cambria math" , serif; font-size: 10.0pt; line-height: 107%; padding: 0in;">⏎</span></code><o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
[info]
Loading project definition from /home/xuser/softwares/spark-jobserver/project<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
Missing
bintray credentials /home/xuser/.bintray/.credentials. Some bintray features
depend on this.<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
Missing
bintray credentials /home/xuser/.bintray/.credentials. Some bintray features
depend on this.<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
Missing
bintray credentials /home/xuser/.bintray/.credentials. Some bintray features
depend on this.<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
Missing
bintray credentials /home/xuser/.bintray/.credentials. Some bintray features
depend on this.<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
[info]
Set current project to root (in build file:/home/xuser/spark-jobserver/)<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<b>>
re-start</b><o:p></o:p></div>
</div>
<div class="MsoNormal">
<br />
To start the server with a specific configuration file, pass its path to re-start. You can also specify JVM parameters after "---".
With all the options included, the command looks like this:<br />
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 0in;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 0in; padding: 0in;">
>
re-start config/application.conf --- -Xmx512m<o:p></o:p></div>
</div>
</div>
<div class="MsoNormal">
<br />
This starts the Spark job server at <a href="http://localhost:8090/">http://localhost:8090</a>. You can list the running Java processes with jps.<br />
<br />
<br /></div>
<div class="MsoNormal">
<div class="MsoNormal">
<div class="MsoNormal">
<b><span style="font-size: large;"><u>Sample SparkJobs Walkthrough:</u></span></b><br />
<br />
Spark-job-server ships with some sample Spark jobs written in Scala. To package the test jar, run the following command:</div>
<div style="-webkit-text-stroke-width: 0px; border: 1pt solid; color: black; font-family: 'Times New Roman'; font-size: medium; font-style: normal; font-variant: normal; font-weight: normal; letter-spacing: normal; line-height: normal; orphans: auto; padding: 1pt 4pt; text-align: left; text-indent: 0px; text-transform: none; white-space: normal; widows: 1; word-spacing: 0px;">
<div class="MsoNormal" style="border: none; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<div style="margin: 0px;">
[xuser@machine123 spark-jobserver]$ sbt job-server-tests/package<o:p></o:p></div>
</div>
</div>
<br /></div>
<div class="MsoNormal">
<span style="font-family: "calibri" , sans-serif; font-size: 11pt; line-height: 107%;">This
produces a jar in the </span><b style="font-family: Calibri, sans-serif; font-size: 11pt; line-height: 107%;">job-server-tests/target/scala-2.10</b><span style="font-family: "calibri" , sans-serif; font-size: 11pt; line-height: 107%;">
directory. </span>Now upload
the jar to the server:<o:p></o:p><br />
<br /></div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
[xuser@machine123
spark-jobserver]$ curl --data-binary
@job-server-tests/target/scala-2.10/job-server-tests_2.10-0.7.0-SNAPSHOT.jar
localhost:8090/jars/test<code><span style="border: none 1.0pt; color: #333333; font-family: "cambria math" , serif; font-size: 10.0pt; line-height: 107%; padding: 0in;">⏎</span></code><o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
OK<b><o:p></o:p></b></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
The jar is uploaded as the app <b>test</b>. You can see the same information in the web UI.<b><o:p></o:p></b></div>
<div class="MsoNormal">
We can run jobs in two modes: transient context mode and persistent context mode.<br />
<br /><o:p></o:p></div>
<div class="MsoNormal">
<b><span style="font-size: large;"><u>Transient Context Mode - Unrelated Jobs:</u></span><o:p></o:p></b><br />
<br />
In this mode, each job creates its own Spark context. Let's submit the WordCount job to the server:</div>
<div class="MsoNormal">
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<b>[xuser@machine123
~]$ curl -d "input.string = a b c a b see"
'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample'<span style="font-family: "cambria math" , serif; mso-bidi-font-family: "Cambria Math";">⏎</span></b><o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
{<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"status": "STARTED",<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"result": {<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"jobId":
"5453779a-f004-45fc-a11d-a39dae0f9bf4",<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"context":
"b7ea0eb5-spark.jobserver.WordCountExample"<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
}<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
}<span style="font-family: "cambria math" , serif; mso-bidi-font-family: "Cambria Math";">⏎</span><o:p></o:p></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><u><span style="font-size: large;">Persistent Context Mode - Related Jobs:</span><o:p></o:p></u></b></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
In this mode, jobs share an existing Spark context. Create a Spark context named ‘test-context’:</div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
[xuser@machine123
~]$ curl -d ""
'localhost:8090/contexts/test-context?num-cpu-cores=4&memory-per-node=512m'<span style="font-family: "cambria math" , serif; mso-bidi-font-family: "Cambria Math";">⏎</span><o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
OK<span style="font-family: "cambria math" , serif; mso-bidi-font-family: "Cambria Math";">⏎</span><o:p></o:p></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b>To see the existing contexts:</b><o:p></o:p></div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
[xuser@machine123
~]$ curl localhost:8090/contexts<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
["test-context"]<span style="font-family: "cambria math" , serif; mso-bidi-font-family: "Cambria Math";">⏎</span><o:p></o:p></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b>To run a job in the existing context:</b><o:p></o:p></div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<b>[xuser@machine123
~]$ curl -d "input.string = a b c a b see"
'localhost:8090/jobs?appName=test&classPath=spark.jobserver.WordCountExample&context=test-context&sync=true'<span style="font-family: "cambria math" , serif; mso-bidi-font-family: "Cambria Math";">⏎</span><o:p></o:p></b></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
{<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"result": {<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"a": 2,<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"b": 2,<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"c": 1,<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"see": 1<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
}<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
}<o:p></o:p></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
You can also run a job that needs no input arguments by passing an empty body with -d "":</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<o:p></o:p></div>
<div style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<b>[xuser@machine123
~]$ curl -d ""
'localhost:8090/jobs?appName=test&classPath=spark.jobserver.LongPiJob&context=test-context&sync=true'
<span style="font-family: "cambria math" , serif; mso-bidi-font-family: "Cambria Math";">⏎</span></b><o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
{<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
"result": 3.1403460207612457<o:p></o:p></div>
<div class="MsoNormal" style="border: none; margin-bottom: .0001pt; margin-bottom: 0in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
}<o:p></o:p></div>
</div>
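<div class="MsoNormal">
<br />
The curl calls above can also be issued from Java. The following is a minimal sketch and not part of spark-jobserver: the class and method names (JobServerClientSketch, jobsUri, submit) are hypothetical, and it assumes Java 11 or later for java.net.http and a job server listening on localhost:8090 as in this walkthrough. Leave out the context and sync parameters to run in transient context mode.</div>

```java
import java.net.URI;
import java.net.URLEncoder;
import java.net.http.HttpClient;
import java.net.http.HttpRequest;
import java.net.http.HttpResponse;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

// Hypothetical helper, for illustration only; not part of spark-jobserver.
class JobServerClientSketch {

    // Builds the /jobs URL with URL-encoded query parameters, in insertion order.
    static URI jobsUri(String baseUrl, Map<String, String> params) {
        StringJoiner query = new StringJoiner("&");
        for (Map.Entry<String, String> p : params.entrySet()) {
            query.add(URLEncoder.encode(p.getKey(), StandardCharsets.UTF_8)
                    + "=" + URLEncoder.encode(p.getValue(), StandardCharsets.UTF_8));
        }
        return URI.create(baseUrl + "/jobs?" + query);
    }

    // POSTs the job configuration as the request body (like curl -d) and returns the response body.
    static String submit(URI uri, String jobConfig) throws Exception {
        HttpRequest request = HttpRequest.newBuilder(uri)
                .POST(HttpRequest.BodyPublishers.ofString(jobConfig))
                .build();
        HttpResponse<String> response = HttpClient.newHttpClient()
                .send(request, HttpResponse.BodyHandlers.ofString());
        return response.body();
    }

    public static void main(String[] args) throws Exception {
        Map<String, String> params = new LinkedHashMap<>();
        params.put("appName", "test");
        params.put("classPath", "spark.jobserver.WordCountExample");
        params.put("context", "test-context"); // omit for transient context mode
        params.put("sync", "true");            // omit to get a job ID back instead of the result
        URI uri = jobsUri("http://localhost:8090", params); // assumed server address
        System.out.println(submit(uri, "input.string = a b c a b see"));
    }
}
```

<div class="MsoNormal">
Keeping the URL construction separate from the HTTP call makes it easy to check the request without a running server.</div>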
<br />
<div class="MsoNormal">
You can check a job's status by passing its job ID in the following command:</div>
<div class="MsoNormal">
<br /></div>
<div style="border: 1pt solid windowtext; padding: 1pt 4pt;">
<div class="MsoNormal" style="border: none; margin-bottom: 0.0001pt; padding: 0in;">
[xuser@machine123 ~]$ curl localhost:8090/jobs/&lt;jobID&gt;<o:p></o:p></div>
</div>
<div class="MsoNormal">
<br /></div>
<span style="font-family: "calibri" , sans-serif; font-size: 11pt; line-height: 107%;">You can see all the running, completed, and
failed jobs in the Job Server UI. Now you are ready to write your own jobs to run on Spark Job Server!</span></div>
</div>
</div>
</div>
</div>
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com31tag:blogger.com,1999:blog-7459259550976670934.post-60424113661256069552016-04-15T18:17:00.002+05:302016-04-15T18:17:46.160+05:30Frequent Issues occurred during Spark Development<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
While coding, we face many issues, whether at compilation or at execution time. I have collected some frequently faced
issues in Spark development here.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<o:p></o:p></div>
<div class="MsoListParagraph">
</div>
<ul style="text-align: left;">
<li><b>  When running
Spark on Windows, the following error is sometimes displayed:</b></li>
</ul>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">Caused by: java.lang.RuntimeException: The
root scratch dir: /tmp/hive on HDFS should be writable. Current permissions
are: rwxrwxr-x<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">at
org.apache.hadoop.hive.ql.session.SessionState.createRootHDFSDir(SessionState.java:529)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">at
org.apache.hadoop.hive.ql.session.SessionState.createSessionDirs(SessionState.java:478)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">at
org.apache.hadoop.hive.ql.session.SessionState.start(SessionState.java:430)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">... 7 more<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<br /></div>
<div class="MsoListParagraphCxSpFirst">
<b> Solution:<o:p></o:p></b></div>
<div style="border: dashed windowtext 1.0pt; margin-left: .5in; margin-right: 0in; mso-border-alt: dot-dot-dash windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 1.0pt 1.0pt 1.0pt;">
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 1.0pt 1.0pt 1.0pt; padding: 0in;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">You need to give 777
permissions to this directory. </span><br />
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">For example, if /tmp/hive is on your D: drive, </span><span style="font-size: 10pt; line-height: 107%;">run the following command:</span><br />
<span style="font-size: 10pt; line-height: 107%;"><br /></span></div>
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 1.0pt 1.0pt 1.0pt; padding: 0in;">
<span style="font-size: 10pt; line-height: 107%;"><span style="color: #3d85c6;">D:\winutils\bin\winutils.exe chmod 777
D:\tmp\hive</span><span style="color: #0070c0;"><o:p></o:p></span></span></div>
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 1.0pt 1.0pt 1.0pt; padding: 0in;">
<span style="font-size: 10pt; line-height: 107%;">For the complete installation steps, refer to the previous </span><a href="http://nishutayaltech.blogspot.in/2015/04/how-to-run-apache-spark-on-windows7-in.html" style="font-size: 10pt; line-height: 107%;">post</a><span style="font-size: 10pt; line-height: 107%;">.</span></div>
</div>
<div class="MsoListParagraphCxSpMiddle">
<br />
<br /></div>
<div class="MsoListParagraphCxSpMiddle">
</div>
<ul>
<li><b> How to launch the master and worker on Windows manually?</b></li>
</ul>
<div class="MsoListParagraphCxSpFirst">
<b> Solution:<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle">
<o:p></o:p></div>
<div style="border: dashed windowtext 1.0pt; margin-left: .5in; margin-right: 0in; mso-border-alt: dot-dot-dash windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Open a command prompt and go to the %SPARK_HOME%/bin folder. Run the following commands:<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<br /></div>
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<span style="color: #0070c0; font-size: 10.0pt; line-height: 107%;">spark-class
org.apache.spark.deploy.master.Master</span><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"> <span style="font-family: "wingdings";"><=</span></span><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"> for master
node<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpLast" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<span style="font-size: 10.0pt; line-height: 107%;"><span style="color: #0070c0;">spark-class
org.apache.spark.deploy.worker.Worker spark://masternode:7077 </span><span style="font-family: "wingdings";"><=</span></span><span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"> for worker node<o:p></o:p></span></div>
</div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-indent: 0.5in;">
<br />
<br /></div>
<ul>
<li><b> How to
get rid of the “<span style="color: red;">A master URL must be set in your configuration</span>”
error?</b></li>
</ul>
<div class="MsoListParagraphCxSpFirst">
<b> Solution:<o:p></o:p></b></div>
<div style="border: dashed windowtext 1.0pt; margin-left: .5in; margin-right: 0in; mso-border-alt: dot-dot-dash windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">From command line:<b><o:p></o:p></b></span><br />
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;"><br /></span></div>
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Set <span style="color: #0070c0;">-Dspark.master=spark://hostname:7077
</span>as a JVM parameter.<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<br /></div>
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">From code, use the SparkConf.setMaster() method:<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
<span style="color: #0070c0; font-size: 10.0pt; line-height: 107%;">SparkConf conf = new SparkConf().setAppName("App_Name")</span><span style="color: #0070c0; font-size: 10pt; line-height: 107%;">.setMaster("spark://hostname:7077");</span></div>
</div>
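<div class="MsoNormal">
<br />
A value set in code through setMaster() takes precedence over -Dspark.master, so a hard-coded URL ties the jar to one cluster. One way to combine the two options is sketched below in plain Java. The class MasterUrlSketch, the method resolveMaster and the local[*] fallback are assumptions for illustration and not part of Spark's API; the sketch also assumes Java 11 or later for String.isBlank().</div>

```java
// Hypothetical helper, for illustration only; not part of Spark.
class MasterUrlSketch {

    // Returns the value passed as -Dspark.master if present, otherwise the given fallback.
    static String resolveMaster(String fallback) {
        String fromJvm = System.getProperty("spark.master");
        return (fromJvm == null || fromJvm.isBlank()) ? fallback : fromJvm;
    }

    public static void main(String[] args) {
        // local[*] is an assumed default for running on a developer machine.
        String master = resolveMaster("local[*]");
        System.out.println("Using master: " + master);
        // In a Spark application the result would be passed on, for example:
        // new SparkConf().setAppName("App_Name").setMaster(master)
    }
}
```

<div class="MsoNormal">
With this approach the same jar runs locally by default and against a cluster when started with -Dspark.master=spark://hostname:7077.</div>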
<div class="MsoListParagraphCxSpMiddle">
<br />
<br /></div>
<div class="MsoListParagraphCxSpLast">
</div>
<ul style="text-align: left;">
<li><b> How to
solve the following “Please use a larger heap size” error?</b></li>
</ul>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">Exception in thread "main"
java.lang.IllegalArgumentException: System memory 259522560 must be at least
4.718592E8. Please use a larger heap size.<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">at<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">org.apache.spark.memory.UnifiedMemoryManager$.getMaxMemory(UnifiedMemoryManager.scala:193)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">at org.apache.spark.memory.UnifiedMemoryManager$.apply(UnifiedMemoryManager.scala:175)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;">at
org.apache.spark.SparkEnv$.create(SparkEnv.scala:354)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;"> at
org.apache.spark.SparkEnv$.createDriverEnv(SparkEnv.scala:193)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;"> at org.apache.spark.SparkContext.createSparkEnv(SparkContext.scala:288)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;"> at
org.apache.spark.SparkContext.&lt;init&gt;(SparkContext.scala:457)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;"> at
org.apache.spark.api.java.JavaSparkContext.&lt;init&gt;(JavaSparkContext.scala:59)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;"> at
com.spark.example.SimpleApp.main(SimpleApp.java:18)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="color: red; font-family: "courier new"; font-size: 8.0pt;"><br /></span></div>
<div class="MsoListParagraphCxSpFirst">
<b> </b><b> Solution:</b></div>
<div style="border: dashed windowtext 1.0pt; margin-left: .5in; margin-right: 0in; mso-border-alt: dot-dot-dash windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoListParagraphCxSpMiddle" style="border: none; margin-left: 0in; mso-add-space: auto; mso-border-alt: dot-dot-dash windowtext .5pt; mso-padding-alt: 1.0pt 4.0pt 1.0pt 4.0pt; padding: 0in;">
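<span style="font-size: 10.0pt; line-height: 107%; mso-bidi-font-size: 11.0pt;">Add <span style="color: #0070c0;">-Xmx1024m -Xms512m </span>to the
VM arguments so that the maximum heap exceeds the minimum named in the error (4.718592E8 bytes, i.e. 450 MB).<o:p></o:p></span></div>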
</div>
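<div class="MsoNormal">
<br />
To confirm that the VM arguments took effect before creating the Spark context, you can compare the JVM's maximum heap with the minimum from the error message. This is a minimal sketch: the class HeapCheckSketch and the method hasEnoughHeap are hypothetical names, the threshold is copied from the message above, and it assumes the "system memory" Spark reports is the JVM's maximum heap.</div>

```java
// Hypothetical helper, for illustration only; not part of Spark.
class HeapCheckSketch {

    // Minimum from the error message above: 4.718592E8 bytes = 450 MB.
    static final long MIN_SYSTEM_MEMORY = 471_859_200L;

    // True if the given maximum heap size meets the minimum from the error message.
    static boolean hasEnoughHeap(long maxHeapBytes) {
        return maxHeapBytes >= MIN_SYSTEM_MEMORY;
    }

    public static void main(String[] args) {
        long maxHeap = Runtime.getRuntime().maxMemory(); // governed by -Xmx
        System.out.printf("Max heap: %d bytes, enough for Spark: %b%n",
                maxHeap, hasEnoughHeap(maxHeap));
    }
}
```

<div class="MsoNormal">
For the heap in the error above (259522560 bytes) the check fails; with -Xmx1024m it should pass.</div>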
<div class="MsoNormal">
</div>
<div class="MsoListParagraphCxSpLast">
<br />
<br />
Stay tuned for further updates..!!! </div>
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com7tag:blogger.com,1999:blog-7459259550976670934.post-54278264117513627712015-09-02T23:45:00.001+05:302015-09-02T23:45:40.203+05:30Co-reference Resolution in Stanford CoreNLP<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: Arial, sans-serif; font-size: 10pt; line-height: 115%;">In the previous
post, we discussed dependency parsing. Now we will discuss how to
identify the expressions or entities that refer to the same person, thing, or
other object. This problem is solved by co-reference resolution.</span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;"><b><a href="https://en.wikipedia.org/wiki/Coreference">Co-reference</a> resolution (or anaphora resolution)</b> is
the task of finding all the expressions that refer to the same entity, often across
multiple sentences.</span></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;"><br /></span></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;">Example: </span><b><span style="color: red; font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;">James</span><span style="font-family: Arial, sans-serif; font-size: 10pt; line-height: 115%;">
said that <span style="color: red;">he</span>
would go out for dinner.</span></b></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;"><br /></span></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;">Here, ‘<b>James</b>’ and ‘<b>he</b>’ both refer to the same person.<o:p></o:p></span></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;">Co-reference resolution is
an important step in natural language processing tasks such as information retrieval
and question answering.<o:p></o:p></span></div>
<br />
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;">Now we’ll see how to
implement it using the Stanford CoreNLP package in Java.<o:p></o:p></span></div>
<div class="MsoNormal">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;"><br /></span></div>
<div style="background: #DBE5F1; border: solid windowtext 1.0pt; mso-background-themecolor: accent1; mso-background-themetint: 51; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 1.0pt 1.0pt 1.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> public class CoRefExample {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> public
static void main(String[] args) throws IOException {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> Properties
props = new Properties();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> props.put("annotators",
"tokenize, ssplit, pos, lemma,
ner, parse, dcoref");<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> StanfordCoreNLP pipeline = new
StanfordCoreNLP(props);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> //
read some text in the text variable<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> String
text = "The Revolutionary War occurred in the 1700s .it was the first war
in the US states";<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> <o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> //
create an empty Annotation just with the given text<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> Annotation
document = new Annotation(text);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> //
run all Annotators on this text<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> pipeline.annotate(document);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> //
This is the coreference link graph<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> //
Each chain stores a set of mentions that link to each other,<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> //
along with a method for getting the most representative mention<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> //
Both sentence and token offsets start at 1!<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> Map<Integer,
CorefChain> graph = document.get(CorefChainAnnotation.class);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> for
(Map.Entry<Integer, CorefChain> entry : graph.entrySet()) {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> CorefChain
c = entry.getValue();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> <o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
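<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                  //
skip single-mention chains: they only yield self references, which aren't that useful<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                  if
(c.getMentionsInTextualOrder().size() <= 1) continue;<o:p></o:p></span></div>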
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> <o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> CorefMention
cm = c.getRepresentativeMention();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> String
clust = "";<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> List<CoreLabel>
tks = document.get(SentencesAnnotation.class)<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> .get(cm.sentNum
- 1).get(TokensAnnotation.class);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> <o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> for
(int i = cm.startIndex - 1; i < cm.endIndex - 1; i++)<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> clust
+= tks.get(i).get(TextAnnotation.class) + " ";<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> clust
= clust.trim();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> System.out.println("representative
mention: \"" + clust<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> +
"\" is mentioned by:");<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> Iterable<Set<CorefMention>>
cSet = c.getMentionMap().values();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> <o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                  // visit every mention in the chain, not just the representative one<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                  for (Set<CorefMention> mentions : cSet) {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                        for (CorefMention m : mentions) {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                              String clust2 = "";<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                              tks = document.get(SentencesAnnotation.class).get(m.sentNum - 1)<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                                          .get(TokensAnnotation.class);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                              for (int i = m.startIndex - 1; i < m.endIndex - 1; i++)<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                                    clust2 += tks.get(i).get(TextAnnotation.class) + " ";<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                              clust2 = clust2.trim();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                              // don't need the self mention<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                              if (clust.equals(clust2))<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                                    continue;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                              System.out.println("\t" + clust2);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                        }<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">                  }<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;">            }<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> }<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 9.0pt; line-height: 115%;"> }<o:p></o:p></span></div>
</div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;">When you run the code above, “The Revolutionary
War” and “It” are reported as mentions of the same entity.<o:p></o:p></span></div>
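<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
The index arithmetic in the loops above is easy to get wrong: the mention offsets are 1-based, and the loop bounds treat the end index as exclusive. The following is a minimal sketch of just that arithmetic on a plain list of strings. It does not use CoreNLP; the class name, the method name, and the sample offsets 1 and 4 are illustrative assumptions, not values taken from the library.</div>

```java
import java.util.Arrays;
import java.util.List;

final class MentionSpanDemo {
    /**
     * Joins the tokens of a mention whose offsets are 1-based,
     * with startIndex inclusive and endIndex exclusive
     * (the same convention the loops above rely on).
     */
    static String mentionText(List<String> tokens, int startIndex, int endIndex) {
        // Shift both bounds by one to get 0-based list positions.
        return String.join(" ", tokens.subList(startIndex - 1, endIndex - 1));
    }

    public static void main(String[] args) {
        List<String> tokens = Arrays.asList(
                "The", "Revolutionary", "War", "occurred", "in", "the", "1700s", ".");
        // Hypothetical offsets for the first three tokens.
        System.out.println(mentionText(tokens, 1, 4)); // prints "The Revolutionary War"
    }
}
```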
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<span style="font-family: "Arial","sans-serif"; font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;">Now it’s your turn to try it out. </span><span style="font-family: Arial, sans-serif; font-size: 10pt; line-height: 115%;">You can find the full code on </span><a href="https://github.com/nishutayal/StanfordDependencyMaster/blob/master/src/main/java/com/example/dependency/CoRefExample.java" style="font-family: Arial, sans-serif; font-size: 10pt; line-height: 115%;">GitHub</a><span style="font-family: Arial, sans-serif; font-size: 10pt; line-height: 115%;">.</span></div>
</div>
Dependency Parsing in Stanford CoreNLP (Nishu Tayal, 2015-08-30)
<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
If you are working on Natural Language Processing, this post
will be useful for extracting triplets from
documents.</div>
<div class="MsoNormal">
We assume you have basic knowledge of concepts such as <a href="https://en.wikipedia.org/wiki/Part-of-speech_tagging">Part-of-Speech
tagging</a> and tokens. Let’s
discuss Dependency Parsing first.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-size: large;">Stanford Dependency
Parsing:</span><o:p></o:p></b></div>
<div class="MsoNormal">
Stanford dependencies provide a representation of the
grammatical relations between words in a sentence. Each dependency is a
triplet: the name of the relation, the governor, and the dependent.</div>
<div class="MsoNormal">
Here is an example sentence:</div>
<div class="MsoNormal">
“<b>Bell, based in Los Angeles, makes and distributes
electronic, computer and building products.</b>”</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
We can see that the
subject of the verb ‘distributes’ is <b>Bell</b>.
For the sentence above, the Stanford dependencies (SD) representation is:</div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgE7fqOJOgVnJVc-9ZHpq6RhJLsS6yFvU4zWrO-N3PsJpga-s61gNA1Qo3ob6fA1vTGwiE4mJQ-LHEZ0dW9LRKwrVu62FbzPguPLOtdNIJPQf5Kch29O3gxzGsC_7-EU6iIdZ79XciK6wSI/s1600/dependency.png" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"><img border="0" height="320" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgE7fqOJOgVnJVc-9ZHpq6RhJLsS6yFvU4zWrO-N3PsJpga-s61gNA1Qo3ob6fA1vTGwiE4mJQ-LHEZ0dW9LRKwrVu62FbzPguPLOtdNIJPQf5Kch29O3gxzGsC_7-EU6iIdZ79XciK6wSI/s320/dependency.png" width="310" /></a></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="text-align: justify;">
<span style="color: #2e74b5;"> nsubj(makes-8, Bell-1)</span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> nsubj(distributes-10, Bell-1)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> vmod(Bell-1, based-3)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> nn(Angeles-6, Los-5)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> prep_in(based-3, Angeles-6)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> root(ROOT-0, makes-8)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> conj_and(makes-8, distributes-10)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> amod(products-16, electronic-11)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> conj_and(electronic-11, computer-13)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> amod(products-16, computer-13)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> conj_and(electronic-11, building-15)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5;"> amod(products-16, building-15)</span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt; text-align: justify;">
<span style="color: #2e74b5; text-align: left;"> dobj(makes-8, products-16)</span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
In the representation above, the first term is the dependency tag, which
names the relation between the governor (2nd term) and the dependent (3rd term). The number attached to each word is its position in the sentence.</div>
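<div class="MsoNormal">
To make the triplet structure concrete, here is a minimal sketch that splits one printed dependency such as nsubj(makes-8, Bell-1) into its relation, governor and dependent. It is plain Java and does not use the Stanford API; the class name, field names and the regular expression are assumptions made for illustration only.</div>

```java
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** One typed dependency in the printed form relation(governor-index, dependent-index). */
final class DependencyTriple {
    // Matches e.g. "nsubj(makes-8, Bell-1)"; the trailing number is the token position.
    private static final Pattern FORMAT =
            Pattern.compile("(\\w+)\\((.+)-(\\d+),\\s*(.+)-(\\d+)\\)");

    final String relation;
    final String governor;
    final int governorIndex;
    final String dependent;
    final int dependentIndex;

    private DependencyTriple(String relation, String governor, int governorIndex,
                             String dependent, int dependentIndex) {
        this.relation = relation;
        this.governor = governor;
        this.governorIndex = governorIndex;
        this.dependent = dependent;
        this.dependentIndex = dependentIndex;
    }

    /** Parses one line of dependency output; rejects anything in another shape. */
    static DependencyTriple parse(String line) {
        Matcher m = FORMAT.matcher(line.trim());
        if (!m.matches()) {
            throw new IllegalArgumentException("Not a typed dependency: " + line);
        }
        return new DependencyTriple(m.group(1),
                m.group(2), Integer.parseInt(m.group(3)),
                m.group(4), Integer.parseInt(m.group(5)));
    }

    public static void main(String[] args) {
        DependencyTriple t = parse("nsubj(makes-8, Bell-1)");
        // prints "nsubj: governor=makes dependent=Bell"
        System.out.println(t.relation + ": governor=" + t.governor + " dependent=" + t.dependent);
    }
}
```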
<div class="MsoNormal">
The full set of dependency tags is listed in the <a href="http://nlp.stanford.edu/software/dependencies_manual.pdf">Stanford typed dependencies
manual</a>.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
There are two types of dependency representation:</div>
<div class="MsoNormal">
</div>
<ul style="text-align: left;">
<li><span style="font-family: Symbol; text-indent: -0.25in;"><span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">  </span></span><b style="text-indent: -0.25in;">Basic/Non-collapsed</b><span style="text-indent: -0.25in;">: This representation gives the basic dependencies as well as the
extra ones (which break the tree structure), without any collapsing or
propagation of conjuncts. For the phrase “based in Los Angeles” in the example sentence, it gives:</span>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: .0001pt; margin-bottom: 0in; mso-add-space: auto;">
<span style="color: #3d85c6;">               prep(based-3,
in-4) <o:p></o:p></span></div>
<div class="MsoListParagraphCxSpLast" style="margin-bottom: .0001pt; margin-bottom: 0in; mso-add-space: auto;">
<span style="color: #3d85c6;">               pobj(in-4, Angeles-6) </span><b><o:p></o:p></b></div>
</li>
</ul>
<div class="MsoListParagraphCxSpMiddle">
</div>
<ul style="text-align: left;">
<li><span style="font-family: Symbol; text-indent: -0.25in;"><span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">  </span></span><b style="text-indent: -0.25in;">Collapsed</b><span style="text-indent: -0.25in;">: In the collapsed representation, dependencies involving prepositions and
conjuncts, as well as information about the referent of relative clauses, are
collapsed to get direct dependencies between content words. For instance, the
two dependencies involving the preposition “in” in the example above are
collapsed into a single relation:</span>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: .0001pt; margin-bottom: 0in; mso-add-space: auto;">
<span style="color: #3d85c6;">               prep(based-3,
in-4) <o:p></o:p></span></div>
<div class="MsoListParagraphCxSpLast" style="margin-bottom: .0001pt; margin-bottom: 0in; mso-add-space: auto;">
<span style="color: #3d85c6;">               pobj(in-4, Angeles-6) </span><b><o:p></o:p></b></div>
</li>
</ul>
<div class="MsoListParagraphCxSpLast">
                   will become:
<span style="color: #3d85c6;">prep_in(based-3, Angeles-6)</span><b><o:p></o:p></b></div>
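<div class="MsoNormal">
The collapsing step for prepositions can be sketched in a few lines of plain Java. This is only an illustration of the idea, not how the Stanford parser implements it: the class CollapseDemo, the helper type Dep and the method collapsePrepositions are hypothetical names, and the sketch handles only the prep/pobj case (not conjuncts or relative clauses). The token indices follow the example sentence from this post.</div>

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

final class CollapseDemo {
    /** Minimal typed dependency; words carry their token index, e.g. "based-3". */
    static final class Dep {
        final String rel, gov, dep;
        Dep(String rel, String gov, String dep) {
            this.rel = rel;
            this.gov = gov;
            this.dep = dep;
        }
        @Override public String toString() {
            return rel + "(" + gov + ", " + dep + ")";
        }
    }

    /** Merges each prep(X, p) + pobj(p, Y) pair into a single prep_p(X, Y). */
    static List<Dep> collapsePrepositions(List<Dep> basic) {
        Map<String, String> objectOf = new HashMap<>(); // preposition token -> its object
        Set<String> prepTokens = new HashSet<>();       // tokens attached via "prep"
        for (Dep d : basic) {
            if (d.rel.equals("pobj")) objectOf.put(d.gov, d.dep);
            if (d.rel.equals("prep")) prepTokens.add(d.dep);
        }
        List<Dep> collapsed = new ArrayList<>();
        for (Dep d : basic) {
            if (d.rel.equals("prep") && objectOf.containsKey(d.dep)) {
                // Strip the "-index" suffix to recover the preposition itself.
                String word = d.dep.substring(0, d.dep.lastIndexOf('-')).toLowerCase();
                collapsed.add(new Dep("prep_" + word, d.gov, objectOf.get(d.dep)));
            } else if (d.rel.equals("pobj") && prepTokens.contains(d.gov)) {
                continue; // already absorbed into the prep_* relation
            } else {
                collapsed.add(d);
            }
        }
        return collapsed;
    }

    public static void main(String[] args) {
        List<Dep> basic = Arrays.asList(
                new Dep("vmod", "Bell-1", "based-3"),
                new Dep("prep", "based-3", "in-4"),
                new Dep("nn", "Angeles-6", "Los-5"),
                new Dep("pobj", "in-4", "Angeles-6"));
        for (Dep d : collapsePrepositions(basic)) {
            System.out.println(d);
        }
    }
}
```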
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Now let’s see how to obtain these dependencies in Java code.</div>
<div class="MsoNormal">
<br /></div>
<div style="background: #E8EEF8; border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;">import java.util.*;</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;">import edu.stanford.nlp.ling.*;</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;">import edu.stanford.nlp.trees.*;</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;">import
edu.stanford.nlp.parser.lexparser.LexicalizedParser;</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;">class ParserDemo {</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> public
static void main(String[] args) {</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> LexicalizedParser
lp = LexicalizedParser</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> .loadModel("edu/stanford/nlp/models/lexparser/englishPCFG.ser.gz");</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> lp.setOptionFlags(new
String[] { "-maxLength", "80",</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> "-retainTmpSubcategories"
});</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> String[]
sent = { "This", "is", "an", "easy",
"sentence", "." };</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> List<CoreLabel>
rawWords = Sentence.toCoreLabelList(sent);</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> Tree
parse = lp.apply(rawWords);</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> parse.pennPrint();</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> System.out.println();</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> </span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> TreebankLanguagePack
tlp = new PennTreebankLanguagePack();</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> GrammaticalStructureFactory
gsf = tlp.grammaticalStructureFactory();</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> GrammaticalStructure
gs = gsf.newGrammaticalStructure(parse);</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> List<TypedDependency>
tdl = gs.typedDependenciesCCprocessed();</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> System.out.println(tdl);</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> TreePrint
tp = new TreePrint("penn,typedDependenciesCollapsed");</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> tp.printTree(parse);</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;"> }</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Georgia, Times New Roman, serif;">}</span></div>
</div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Now you can easily extract the triplets from a document. You can find the example code in the <a href="https://github.com/nishutayal/StanfordDependencyMaster" target="_blank">GitHub repo</a>.</div>
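<div class="MsoNormal">
As a rough illustration of a follow-up step, the sketch below turns the printed typed dependencies into simple (relation, governor, dependent) triplets using plain string matching. It assumes the dependencies print in the form <i>relation(governor-index, dependent-index)</i>; the class <i>DependencyTriplets</i>, its <i>parse</i> method and the sample string are hypothetical and are not part of the Stanford parser API.</div>

```java
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Hypothetical helper, not part of the Stanford parser: reads printed typed
// dependencies such as "nsubj(sentence-5, This-1)" and returns simple triplets.
class DependencyTriplets {

    // Assumed printed shape: relation(governor-index, dependent-index)
    private static final Pattern DEP =
        Pattern.compile("([\\w:]+)\\((.+?)-(\\d+)'*, (.+?)-(\\d+)'*\\)");

    /** Returns one {relation, governor, dependent} array per dependency found. */
    static List<String[]> parse(String printed) {
        List<String[]> triplets = new ArrayList<>();
        Matcher m = DEP.matcher(printed);
        while (m.find()) {
            triplets.add(new String[] { m.group(1), m.group(2), m.group(4) });
        }
        return triplets;
    }

    public static void main(String[] args) {
        // Illustrative input only; in practice this would be tdl.toString().
        String printed = "[nsubj(sentence-5, This-1), cop(sentence-5, is-2), det(sentence-5, an-3)]";
        for (String[] t : parse(printed)) {
            System.out.println(t[0] + " | " + t[1] + " | " + t[2]);
        }
    }
}
```

<div class="MsoNormal">
Working directly on the TypedDependency objects is more robust than parsing their printed form; the string version is used here only to keep the sketch free of library dependencies.</div>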
<br />
<div class="MsoNormal">
<br /></div>
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com6tag:blogger.com,1999:blog-7459259550976670934.post-21845195668711907992015-07-02T12:00:00.000+05:302015-08-04T13:47:43.280+05:30Writing a custom NameFinder model in OpenNLP<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">OpenNLP provides several NER models, but entity extraction does not have to stop at the existing ones. We may need to find entities specific to a domain such as clinical, biological, sports or banking text.</span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">So should we restrict ourselves to the models already provided? No.</span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">We can build our own name finder model. The steps are: get a sample training dataset, build the model, and test it.</span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;"><br /></span></span></div>
<div class="MsoNormal">
<b><span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">What kind of data do we need to train a model?</span></span></b></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">The training data should contain one sentence per line, so sentences are separated by a newline character (\n). Each entity is wrapped in <START:type> and <END> tags, and the tags must be separated from the surrounding tokens by a space character.</span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;"><br /></span></span></div>
<div style="background: #C6D9F1; border: dashed windowtext 1.0pt; mso-background-themecolor: text2; mso-background-themetint: 51; mso-border-alt: dashed windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10.0pt; line-height: 115%; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><START:medicine> Augmentin-Duo <END></span><span style="font-size: 10.0pt; line-height: 115%; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman";">
is a
penicillin
antibiotic that contains two medicines -
<START:medicine> amoxicillin
trihydrate
<END> and <START:medicine> potassium clavulanate <END> .<span style="letter-spacing: -.15pt;"> </span>They work together
to kill
certain types of bacteria and
are used to
treat certain<span style="letter-spacing: -.05pt;"> </span>types
of bacterial infections.</span><span style="font-size: 18.0pt; line-height: 115%; mso-bidi-font-family: Calibri; mso-bidi-font-size: 11.0pt; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
</div>
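<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">Because the tags are only recognised as separate whitespace-delimited tokens, it can help to check each line before training. The sketch below is a minimal check under that assumption; the class <i>TrainingLineCheck</i> and its method are hypothetical and not part of OpenNLP.</span></span></div>

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical helper, not part of OpenNLP: a quick sanity check for one line
// of name finder training data before it is handed to the trainer.
class TrainingLineCheck {

    /** Returns the entity types found on the line, or throws if a tag is malformed. */
    static List<String> entityTypes(String line) {
        List<String> types = new ArrayList<>();
        boolean open = false;
        // Tags must be whitespace-separated tokens, so split on whitespace.
        for (String token : line.trim().split("\\s+")) {
            if (token.startsWith("<START") && token.endsWith(">")) {
                if (open) {
                    throw new IllegalArgumentException("Nested start tag: " + line);
                }
                open = true;
                // "<START:medicine>" -> "medicine"; a bare "<START>" gets the
                // placeholder label "default" in this sketch.
                int colon = token.indexOf(':');
                types.add(colon < 0 ? "default" : token.substring(colon + 1, token.length() - 1));
            } else if (token.equals("<END>")) {
                if (!open) {
                    throw new IllegalArgumentException("<END> without a start tag: " + line);
                }
                open = false;
            } else if (token.contains("<START") || token.contains("<END>")) {
                // Catches cases such as "<END>." where the space is missing.
                throw new IllegalArgumentException("Tag not separated by a space: " + token);
            }
        }
        if (open) {
            throw new IllegalArgumentException("Missing <END> tag: " + line);
        }
        return types;
    }

    public static void main(String[] args) {
        String line = "<START:medicine> Augmentin-Duo <END> is a penicillin antibiotic .";
        System.out.println(entityTypes(line));
    }
}
```

<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">Running such a check over every line of the training file is a cheap way to catch tagging mistakes before a long training run.</span></span></div>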
<div class="MsoNormal">
<span style="font-size: 14.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;"><span style="font-family: Times, Times New Roman, serif;"><br /></span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">You can refer to a sample <a href="https://github.com/mccraigmccraig/opennlp/blob/master/src/test/resources/opennlp/tools/namefind/AnnotatedSentencesWithTypes.txt">dataset</a>
as an example. The training data should contain at least 15,000 sentences to get
good results.</span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">The model can be trained with the command-line tool or with the Java
training API:</span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;"><br /></span></span></div>
<div class="MsoNormal">
<b><span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">Command-line
tool:</span></span></b></div>
<div class="MsoNormal">
<span style="line-height: 18.3999996185303px;"><span style="font-family: Times, Times New Roman, serif;">Several arguments can be passed when building the model; those shown in square brackets are optional: </span></span></div>
<div style="background: #C6D9F1; border: dashed #006699 1.0pt; mso-background-themecolor: text2; mso-background-themetint: 51; mso-border-alt: dashed #006699 .75pt; mso-element: para-border-div; padding: 0in 0in 0in 0in;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;">$ opennlp
TokenNameFinderTrainer<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;">Usage: opennlp
TokenNameFinderTrainer[.bionlp2004|.conll03|.conll02|.ad] [-resources
resourcesDir] \<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> [-type modelType] [-featuregen
featuregenFile] [-params paramsFile] \<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> [-iterations num] [-cutoff num]
-model modelFile -lang language \<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -data sampleData [-encoding
charsetName]<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;">Arguments description:<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -resources resourcesDir<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> The resources directory<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -type modelType<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;">
The type of the token name finder model<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -featuregen featuregenFile<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> The feature generator
descriptor file<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -params paramsFile<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> training parameters file.<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -iterations num<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> number of training iterations,
ignored if -params is used. Default value is 100.<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -cutoff num<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> minimal number of times a
feature must be seen, ignored if -params is used. Default value is 5.<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -model modelFile<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> output model file.<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -lang language<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> language which is being processed.<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -data sampleData<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> data to be used, usually a file
name.<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> -encoding charsetName<o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-font-size: 8.0pt; mso-bidi-theme-font: minor-latin; mso-fareast-font-family: "Times New Roman"; mso-font-width: 99%;"><span style="font-family: Times, Times New Roman, serif;"> encoding for reading and
writing text, if absent the system default is used.<o:p></o:p></span></span></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">Now let's say we want to build a model “en-ner-drugs.bin” from the
data file “drugsDetails.txt” in English.</span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;"><br /></span></span></div>
<div style="background: #C6D9F1; border: dashed #006699 1.0pt; mso-background-themecolor: text2; mso-background-themetint: 51; mso-border-alt: dashed #006699 .75pt; mso-element: para-border-div; padding: 0in 0in 0in 0in;">
<pre style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; padding: 0in;"><span style="font-family: Times, Times New Roman, serif;">$ opennlp TokenNameFinderTrainer -model en-ner-drugs.bin -lang en -data drugsDetails.txt -encoding UTF-8</span></pre>
</div>
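<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">The -iterations and -cutoff values can also be supplied through a -params file. The sketch below writes such a file with java.util.Properties. It assumes the trainer reads a properties-style file with the keys Algorithm, Iterations and Cutoff, as I recall from the OpenNLP manual; the class name and the file name ner-params.txt are hypothetical.</span></span></div>

```java
import java.io.IOException;
import java.io.Writer;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Properties;

// Hypothetical helper: writes a training parameters file for the -params option.
class WriteTrainingParams {

    /** Builds the key/value pairs; the key names are assumed from the OpenNLP manual. */
    static Properties trainingParams(int iterations, int cutoff) {
        Properties p = new Properties();
        p.setProperty("Algorithm", "MAXENT");
        p.setProperty("Iterations", Integer.toString(iterations));
        p.setProperty("Cutoff", Integer.toString(cutoff));
        return p;
    }

    public static void main(String[] args) throws IOException {
        Path out = Paths.get("ner-params.txt"); // hypothetical file name
        try (Writer w = Files.newBufferedWriter(out, StandardCharsets.UTF_8)) {
            // 100 iterations and a cutoff of 5 mirror the defaults listed above.
            trainingParams(100, 5).store(w, "name finder training parameters");
        }
    }
}
```

<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">The resulting file would then be passed as -params ner-params.txt, in which case -iterations and -cutoff are ignored, as noted in the argument descriptions.</span></span></div>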
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">Now let's see how we can
train the same model using the Java API.</span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;"><br /></span></span></div>
<div class="MsoNormal">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="line-height: 115%;">Steps</span></b><span style="line-height: 115%;">:</span></span></div>
<div class="MsoNormal">
</div>
<ul style="text-align: left;">
<li><span style="line-height: 115%; text-indent: -0.25in;"><span style="font-family: Times, Times New Roman, serif;">Open a sample data stream</span></span></li>
<li><span style="line-height: 115%; text-indent: -0.25in;"><span style="font-family: Times, Times New Roman, serif;">Call the NameFinderME.train method</span></span></li>
<li><span style="line-height: 115%; text-indent: -0.25in;"><span style="font-family: Times, Times New Roman, serif;">Save the TokenNameFinderModel to a file</span></span></li>
</ul>
<br />
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">Here is the example.<span style="font-size: 14pt;"><o:p></o:p></span></span></span></div>
<div style="background: #C6D9F1; border: dashed windowtext 1.0pt; mso-background-themecolor: text2; mso-background-themetint: 51; mso-border-alt: dashed windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;"> java.io.BufferedOutputStream;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;"> java.io.FileInputStream;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;"> java.io.FileOutputStream;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;"> java.io.IOException;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;"> java.nio.charset.Charset;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;"> java.util.Collections;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;"> java.util.HashMap;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;">
opennlp.tools.namefind.NameFinderME;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;">
opennlp.tools.namefind.NameSample;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;">
opennlp.tools.namefind.NameSampleDataStream;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;">
opennlp.tools.namefind.TokenNameFinderModel;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;">
opennlp.tools.util.ObjectStream;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">import</span></b><span style="font-size: 10pt;">
opennlp.tools.util.PlainTextByLineStream;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">public</span></b><span style="font-size: 10pt;"> </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">class</span></b><span style="font-size: 10pt;"> DrugsClassifierTrainer {</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;"> </span><b style="font-family: Times, 'Times New Roman', serif;"><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">static</span></b><span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;"> String </span><i style="font-family: Times, 'Times New Roman', serif;"><span style="color: #0000c0; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">onlpModelPath</span></i><span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;"> = </span><span style="color: #2a00ff; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">"en-ner-drugs.bin"</span><span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;">;</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> </span><span style="color: #3f7f5f; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">// training data set</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">static</span></b><span style="font-size: 10pt;"> String </span><i><span style="color: #0000c0; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">trainingDataFilePath</span></i><span style="font-size: 10pt;"> = </span><span style="color: #2a00ff; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">"D:/NLPTools/Datasets/drugsDetails.txt"</span><span style="font-size: 10pt;">;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">public</span></b><span style="font-size: 10pt;"> </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">static</span></b><span style="font-size: 10pt;"> </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">void</span></b><span style="font-size: 10pt;"> main(String[] args) </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">throws</span></b><span style="font-size: 10pt;"> IOException {</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> Charset
charset = Charset.<i>forName</i>(</span><span style="color: #2a00ff; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">"UTF-8"</span><span style="font-size: 10pt;">);</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> ObjectStream&lt;String&gt;
lineStream = </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">new</span></b><span style="font-size: 10pt;"> PlainTextByLineStream(</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">new</span></b><span style="font-size: 10pt;"> FileInputStream(</span><i><span style="color: #0000c0; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">trainingDataFilePath</span></i><span style="font-size: 10pt;">),
charset);</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> ObjectStream&lt;NameSample&gt;
sampleStream = </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">new</span></b><span style="font-size: 10pt;"> NameSampleDataStream(</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> lineStream);</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;"> TokenNameFinderModel
model = </span><b style="font-family: Times, 'Times New Roman', serif;"><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">null</span></b><span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;">;</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> HashMap&lt;String,
Object&gt; mp = </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">new</span></b><span style="font-size: 10pt;"> HashMap&lt;String, Object&gt;();</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">try</span></b><span style="font-size: 10pt;"> {</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;"> model
= NameFinderME.<i>train</i>(</span><span style="color: #2a00ff; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">"en"</span><span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;">, </span><span style="color: #2a00ff; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">"drugs"</span><span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;">, sampleStream, Collections.&lt;String,Object&gt;emptyMap(), 100, 4);</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;"> }
</span><b style="font-family: Times, 'Times New Roman', serif;"><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">finally</span></b><span style="font-family: Times, 'Times New Roman', serif; font-size: 10pt;"> {</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> sampleStream.close();</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> }</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> BufferedOutputStream
modelOut = </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">null</span></b><span style="font-size: 10pt;">;</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">try</span></b><span style="font-size: 10pt;"> {</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> modelOut
= </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">new</span></b><span style="font-size: 10pt;"> BufferedOutputStream(</span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">new</span></b><span style="font-size: 10pt;"> FileOutputStream(</span><i><span style="color: #0000c0; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">onlpModelPath</span></i><span style="font-size: 10pt;">));</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> model.serialize(modelOut);</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> }
</span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">finally</span></b><span style="font-size: 10pt;"> {</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">if</span></b><span style="font-size: 10pt;"> (modelOut != </span><b><span style="color: #7f0055; font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;">null</span></b><span style="font-size: 10pt;">)</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> modelOut.close();</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> }</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;"> }</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: Times, Times New Roman, serif;"><span style="font-size: 10pt;">}</span><span style="font-size: 10.0pt; mso-bidi-font-family: Calibri; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></span></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">The code above generates the “en-ner-drugs.bin” model.<o:p></o:p></span></span></div>
<br />
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">You are now all set to use this model to find entities, just like
any other NER model. <span style="font-size: 14pt;"><o:p></o:p></span></span></span></div>
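<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">As a rough sketch of that last step, the class below loads the generated model and runs it over one sentence. It assumes the OpenNLP 1.5.x jars are on the classpath and that “en-ner-drugs.bin” sits in the working directory; the class name DrugsFinder, the sample sentence and the use of the whitespace tokenizer are illustrative assumptions, not part of the trainer above.</span></span></div>

```java
import java.io.FileInputStream;
import java.io.IOException;
import java.io.InputStream;

import opennlp.tools.namefind.NameFinderME;
import opennlp.tools.namefind.TokenNameFinderModel;
import opennlp.tools.tokenize.WhitespaceTokenizer;
import opennlp.tools.util.Span;

public class DrugsFinder {

    public static void main(String[] args) throws IOException {
        // Load the model file written by DrugsClassifierTrainer
        InputStream modelIn = new FileInputStream("en-ner-drugs.bin");
        TokenNameFinderModel model;
        try {
            model = new TokenNameFinderModel(modelIn);
        } finally {
            modelIn.close();
        }

        NameFinderME finder = new NameFinderME(model);

        // Hypothetical input; the finder expects one sentence as an array of tokens
        String sentence = "The patient was given aspirin and ibuprofen .";
        String[] tokens = WhitespaceTokenizer.INSTANCE.tokenize(sentence);

        // Each Span covers the token range of one detected entity
        Span[] spans = finder.find(tokens);
        String[] names = Span.spansToStrings(spans, tokens);
        for (int i = 0; i < spans.length; i++) {
            System.out.println(spans[i].getType() + ": " + names[i]);
        }

        // Reset document-level adaptive data before processing the next document
        finder.clearAdaptiveData();
    }
}
```

<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">What the finder actually detects depends entirely on the training data, so treat the sentence above only as a placeholder for your own text.</span></span></div>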
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;"><br /></span></span></div>
<div class="MsoNormal">
<span style="line-height: 115%;"><span style="font-family: Times, Times New Roman, serif;">For more details, you can go through <a href="https://opennlp.apache.org/documentation/1.5.3/manual/opennlp.html#tools.namefind.training" target="_blank">OpenNLP Documentation</a>.</span></span></div>
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com34tag:blogger.com,1999:blog-7459259550976670934.post-47225204285052594292015-04-15T22:35:00.001+05:302016-04-28T15:04:42.238+05:30How to run Apache Spark on Windows7 in standalone mode<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
So far, we might have set up Spark with Hadoop, EC2,
or Mesos on a Linux machine. But what if
we don’t want Hadoop or EC2 and just want to run Spark in standalone mode on
Windows?</div>
<div class="MsoNormal">
Here we’ll see how to run Spark on a Windows machine.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b>Prerequisites: </b></div>
<ul style="text-align: left;">
<li><span style="text-indent: -0.25in;">Java 6+</span></li>
<li><span style="text-indent: -0.25in;">Scala 2.10.x</span></li>
<li><span style="text-indent: -0.25in;">Python 2.6+</span></li>
<li><span style="text-indent: -0.25in;">Spark 1.2.x</span></li>
<li><span style="text-indent: -0.25in;">sbt (only if you build Spark from source)</span></li>
<li><span style="text-indent: -0.25in;">Git (if you use the sbt tool)</span></li>
</ul>
<br />
<div class="MsoNormal">
Now let’s go through the installation steps:</div>
<ul style="text-align: left;">
<li><span style="text-indent: -22.5pt;">Install Java 7 or later. Set the JAVA_HOME environment variable and add %JAVA_HOME%\bin to the PATH
variable.</span></li>
<li><span style="text-indent: -13.5pt;">Download </span><a href="http://www.scala-lang.org/download/2.10.5.html" style="text-indent: -13.5pt;">Scala 2.10</a><span style="text-indent: -13.5pt;"> or Scala 2.11 and
install it. Set </span><b style="text-indent: -13.5pt;">SCALA_HOME</b><span style="text-indent: -13.5pt;"> and add </span><b style="text-indent: -13.5pt;">%SCALA_HOME%\bin</b><span style="text-indent: -13.5pt;">
to the PATH environment variable. To check that Scala is installed,
run the following command</span><span style="text-indent: -13.5pt;">: </span><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgawG3M6a0id6ImlQiW9A-5E5T6wOF96AMpb40WCtSNLILlAL4g7LA2fD67_QM2XHVTgHGAhEM9-vmINb9SnOy3jvkY1aPx4ppTkVCtuYYkwl-WpthmD8fYKNUvohtd5wfJ9Favep8u3mRK/s1600/scalaversion.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgawG3M6a0id6ImlQiW9A-5E5T6wOF96AMpb40WCtSNLILlAL4g7LA2fD67_QM2XHVTgHGAhEM9-vmINb9SnOy3jvkY1aPx4ppTkVCtuYYkwl-WpthmD8fYKNUvohtd5wfJ9Favep8u3mRK/s1600/scalaversion.png" /></a></li>
<li><span style="text-indent: -13.5pt;">Next comes Spark itself. Spark can be installed in
two ways:</span></li>
</ul>
<ul style="text-align: left;"><ul>
<li><span style="font-family: "courier new"; text-indent: -0.25in;"><span style="font-family: "times new roman"; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Building Spark using SBT</span></li>
<li><span style="font-family: "courier new"; text-indent: -0.25in;"><span style="font-family: "times new roman"; font-size: 7pt; font-stretch: normal;"> </span></span><span style="text-indent: -0.25in;">Use Prebuilt Spark package</span></li>
</ul>
</ul>
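<div class="MsoNormal">
Whichever route you pick, a mistyped environment variable is the most common reason the later commands fail. The small Java program below is only a sketch for checking the variables described in the steps above: for each name you pass in, it reports whether the variable is set and whether its bin folder appears on PATH. The class name SparkEnvCheck and its methods are hypothetical helpers written for this post; you can pass other names (for example SPARK_HOME) in the same way.</div>

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.regex.Pattern;

class SparkEnvCheck {

    // Windows variable names are case-insensitive ("Path" vs "PATH"),
    // but the Map returned by System.getenv() is not, so search by hand.
    static String lookup(Map<String, String> env, String name) {
        for (Map.Entry<String, String> e : env.entrySet()) {
            if (e.getKey().equalsIgnoreCase(name)) {
                return e.getValue();
            }
        }
        return null;
    }

    // Unifies slashes, lower-cases and drops trailing slashes so that two
    // spellings of the same Windows folder compare equal.
    static String normalize(String path) {
        String p = path.trim().replace('\\', '/').toLowerCase();
        while (p.endsWith("/")) {
            p = p.substring(0, p.length() - 1);
        }
        return p;
    }

    // Returns one message per problem; an empty list means every variable
    // is set and its bin folder is listed on PATH.
    static List<String> findProblems(Map<String, String> env, String separator, String... homeVars) {
        List<String> onPath = new ArrayList<String>();
        String path = lookup(env, "PATH");
        if (path != null) {
            for (String entry : path.split(Pattern.quote(separator))) {
                onPath.add(normalize(entry));
            }
        }
        List<String> problems = new ArrayList<String>();
        for (String name : homeVars) {
            String home = lookup(env, name);
            if (home == null || home.trim().isEmpty()) {
                problems.add(name + " is not set");
            } else if (!onPath.contains(normalize(home) + "/bin")) {
                problems.add("%" + name + "%\\bin is not on PATH");
            }
        }
        return problems;
    }

    public static void main(String[] args) {
        List<String> problems = findProblems(System.getenv(), File.pathSeparator,
                "JAVA_HOME", "SCALA_HOME");
        if (problems.isEmpty()) {
            System.out.println("Environment looks fine.");
        }
        for (String problem : problems) {
            System.out.println(problem);
        }
    }
}
```

<div class="MsoNormal">
Open a new command prompt before running it, since windows that were already open do not pick up changed environment variables.</div>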
<div class="MsoNormal">
<o:p><br /></o:p></div>
<div class="MsoNormal">
<b><u>Building Spark with
SBT</u>:</b></div>
<ul style="text-align: left;">
<li><span style="text-indent: -13.5pt;">Download </span><a href="http://www.scala-sbt.org/download.html" style="text-indent: -13.5pt;">SBT</a><span style="text-indent: -13.5pt;"> and install it. Set SBT_HOME
and update the PATH environment variable accordingly.</span></li>
<li><span style="text-indent: -13.5pt;">Download the </span><a href="http://spark.apache.org/downloads.html" style="text-indent: -13.5pt;">source code</a><span style="text-indent: -13.5pt;"> from the Spark
website; it can be built against any of the supported Hadoop versions.</span></li>
<li><span style="text-indent: -13.5pt;">Run the </span><b style="text-indent: -13.5pt;">sbt
assembly </b><span style="text-indent: -13.5pt;">command to build the </span><a href="http://spark.apache.org/docs/1.2.1/building-spark.html#building-with-sbt" style="text-indent: -13.5pt;">Spark
package</a>.</li>
<li><span style="text-indent: -13.5pt;">You also need to specify the Hadoop version while
building, as follows: </span></li>
</ul>
<blockquote class="tr_bq">
<ul style="text-align: left;">
<li style="text-align: left; text-indent: -18px;"><span style="text-align: center; text-indent: 4.5pt;"><span style="background-attachment: initial; background-clip: initial; background-color: #f2f2f2; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border-image-outset: initial; border-image-repeat: initial; border-image-slice: initial; border-image-source: initial; border-image-width: initial; border: 1pt solid windowtext; padding: 0in;"> <b> sbt -Pyarn -Phadoop-2.3 assembly </b></span></span></li>
</ul>
</blockquote>
<div class="MsoNormal" style="margin-left: 1.0in;">
<br /></div>
<div class="MsoNormal">
<b><u>Using Spark Prebuilt
Package</u>:<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpFirst" style="mso-list: l2 level1 lfo3; text-indent: -.25in;">
</div>
<ul style="text-align: left;">
<li><span style="text-indent: -0.25in;">Choose a </span><a href="http://spark.apache.org/downloads.html" style="text-indent: -0.25in;">Spark</a><span style="text-indent: -0.25in;"> prebuilt package for
Hadoop, e.g.</span><span style="text-indent: -0.25in;"> </span><b style="text-indent: -0.25in;">Prebuilt for Hadoop 2.3/2.4 or later</b><span style="text-indent: -0.25in;">. Download it and extract it to
any drive, e.g. D:\spark-1.2.1-bin-hadoop2.3</span></li>
<li><span style="text-indent: -0.25in;">Set </span><b style="text-indent: -0.25in;">SPARK_HOME</b><span style="text-indent: -0.25in;">
and add </span><b style="text-indent: -0.25in;">%SPARK_HOME%\bin </b><span style="text-indent: -0.25in;">to PATH in the
environment variables</span></li>
<li><span style="text-indent: -0.25in;">Run the following command on the command line:</span><o:p> </o:p><a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbw53-3juJXT3jfLevE4z7IiPHx8ULPaK-KQbNcZS3PlirmFnF9KRyuHbnAsp7lbePwmpck48g0k2hv2F7zrySTJmKnSNwoYiJXn8uCWMdrOB9SoKsm1GvWkmLMcQnggiWWRzCUEzVJTX9/s1600/spark-shell.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjbw53-3juJXT3jfLevE4z7IiPHx8ULPaK-KQbNcZS3PlirmFnF9KRyuHbnAsp7lbePwmpck48g0k2hv2F7zrySTJmKnSNwoYiJXn8uCWMdrOB9SoKsm1GvWkmLMcQnggiWWRzCUEzVJTX9/s1600/spark-shell.png" /></a></li>
<li>You’ll get an error for winutils.exe:<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfj4eTz3xjYEkJaudnxK8IpgPzusQCn0Ts8WmKRQdPd1wiPap_Jeo3mdS7mrui5sBeTq25tabT-i8mFmh5fQrpxzIQqzNmfu2g1isoJg7yZMMbd_GoxhHUB0rSPXK0UwoZURfWmrtBuUxt/s1600/error.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgfj4eTz3xjYEkJaudnxK8IpgPzusQCn0Ts8WmKRQdPd1wiPap_Jeo3mdS7mrui5sBeTq25tabT-i8mFmh5fQrpxzIQqzNmfu2g1isoJg7yZMMbd_GoxhHUB0rSPXK0UwoZURfWmrtBuUxt/s1600/error.png" /></a></li>
</ul>
<div class="MsoListParagraphCxSpLast">
<b> </b> Although we aren’t using Hadoop with Spark, something in its startup still
checks for the HADOOP_HOME variable in the configuration. To overcome this error,
download <a href="https://github.com/steveloughran/winutils/blob/master/hadoop-2.6.0/bin/winutils.exe" target="_blank">winutils.exe</a>
and place it in any location (e.g. D:\winutils\bin\winutils.exe).<br />
<br />
<b>P.S.</b> The right winutils.exe may vary with your operating system version. If this one doesn't work on your OS, please find another build and use that instead. You can refer to the <a href="https://wiki.apache.org/hadoop/WindowsProblems" target="_blank">Problems running Hadoop on Windows</a> page for winutils.exe.<br />
<br /></div>
<div class="MsoNormal">
</div>
<ul style="text-align: left;">
<li>Set <b>HADOOP_HOME =
D:\winutils</b> as an environment variable</li>
<li>Now re-run the “spark-shell” command and you’ll see the Scala
shell. With the latest Spark releases, you may get a permission error for the /tmp/hive directory, as shown below:</li>
<span style="background-color: #cccccc;">The root scratch dir: /tmp/hive on HDFS should be writable. Current permissions are: rw-rw-rw-</span> </ul>
<ul style="text-align: left;"> In that case, run the following command:</ul>
<div style="background: #EEECE1; border: solid windowtext 1.0pt; margin-left: .5in; margin-right: 0in; mso-background-themecolor: background2; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b>D:\spark>D:\winutils\bin\winutils.exe
chmod 777 D:\tmp\hive</b><b><span style="font-family: "times new roman" , serif; font-size: 14pt;"><o:p></o:p></span></b></div>
</div>
<ul style="text-align: left;">
<li>For Spark UI : open <a href="http://localhost:4040/">http://localhost:4040/</a>
in browser</li>
<li>To test that the setup works, you can run the example:<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivZIKHUDlUtxkjUPf6ER3JWjzs_AHCTXMfTLKJzDOFaT9gFftRN6Oi-mkSFOtjw9fOr724aNNqsdK20agQlb7IsLiap1QQhBp_WlnPA6Lcsj_uoVI5j9wL8D_R_Wf9bVHpL31ePmwSKbDj/s1600/runexample.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEivZIKHUDlUtxkjUPf6ER3JWjzs_AHCTXMfTLKJzDOFaT9gFftRN6Oi-mkSFOtjw9fOr724aNNqsdK20agQlb7IsLiap1QQhBp_WlnPA6Lcsj_uoVI5j9wL8D_R_Wf9bVHpL31ePmwSKbDj/s1600/runexample.png" /></a> </li>
<li>It will execute the program and return the result :<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4El-XPcAs1JpPSyrujIDIx5E4Xu_To5uvTnWC68AOveIz5JuF1SDXNcw16R4yXK4uOMbnls657wil6iGfnf2EgsawGnEZ7Obn7uJbDnuYwm31Cgtxx5WRoV9rQWrgg1vmttmsRWQHmLoQ/s1600/result.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em; text-align: center;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEh4El-XPcAs1JpPSyrujIDIx5E4Xu_To5uvTnWC68AOveIz5JuF1SDXNcw16R4yXK4uOMbnls657wil6iGfnf2EgsawGnEZ7Obn7uJbDnuYwm31Cgtxx5WRoV9rQWrgg1vmttmsRWQHmLoQ/s1600/result.png" /></a></li>
</ul>
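<div class="MsoNormal">
If spark-shell still reports the winutils.exe error, a quick check like the one below can confirm that HADOOP_HOME points at a folder containing bin\winutils.exe. This is only a minimal sketch: the class name <b>WinutilsCheck</b> and its methods are hypothetical helpers written for this post, not part of Spark or Hadoop.</div>

```java
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;

/** Hypothetical helper (not part of Spark or Hadoop): checks the winutils.exe layout. */
final class WinutilsCheck {

    /** Builds the location where winutils.exe is expected for a given HADOOP_HOME value. */
    static Path expectedWinutils(String hadoopHome) {
        return Paths.get(hadoopHome, "bin", "winutils.exe");
    }

    /** Returns true when HADOOP_HOME is non-empty and bin/winutils.exe exists beneath it. */
    static boolean isConfigured(String hadoopHome) {
        if (hadoopHome == null || hadoopHome.trim().isEmpty()) {
            return false;
        }
        return Files.isRegularFile(expectedWinutils(hadoopHome));
    }

    public static void main(String[] args) {
        // Read the variable exactly as the current process sees it
        String hadoopHome = System.getenv("HADOOP_HOME");
        System.out.println("HADOOP_HOME = " + hadoopHome);
        System.out.println(isConfigured(hadoopHome)
                ? "winutils.exe found"
                : "winutils.exe not found under HADOOP_HOME\\bin");
    }
}
```

<div class="MsoNormal">
Run it from a new command prompt, so that it picks up the environment variables you just changed.</div>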
<div class="MsoNormal">
Now that Spark is successfully launched, start writing
your Java/Python/Scala programs! :)</div>
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com185tag:blogger.com,1999:blog-7459259550976670934.post-55566351795824150782015-02-21T01:41:00.000+05:302015-02-21T01:41:10.822+05:30Writing Java UDF in Apache Pig<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Isn’t it good that we can also write user-defined functions (UDFs)
for custom processing in Pig? Here we’ll talk about writing a UDF in Java.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<u><b>How to write the Java UDF:</b><o:p></o:p></u></div>
<div class="MsoNormal">
First of all, add the Pig dependency to the Java project.</div>
<div class="MsoNormal">
Now, define a UDF class (e.g. HexConversion). Each UDF
extends the <b>EvalFunc<T></b> class. Here ‘T’ denotes the return type, e.g.
DataByteArray, DataBag, Tuple, String etc.
</div>
<div class="MsoNormal">
The UDF implements the <b>exec(Tuple input)</b> method, which is invoked on every input tuple. The
tuple holds the input parameters in the order they are passed to the function in
the Pig script.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
In the following example, we write a UDF that converts the first field of the input tuple
into hexadecimal.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">package com.test.udf;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">import
java.io.IOException;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">import
org.apache.pig.EvalFunc;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">import
org.apache.pig.data.DataByteArray;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">import
org.apache.pig.data.Tuple;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<br /></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">public class HexConversion
extends EvalFunc<DataByteArray> {<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> /**<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">     *
UDF to convert ASCII to hexadecimal. It returns the input string in hex format as a
DataByteArray. <o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> */<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> public DataByteArray exec(final Tuple input) throws
IOException {<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> DataByteArray output = new DataByteArray();<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> if (input == null) {<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">      return null;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> }<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> try {<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> final String str =
input.get(0).toString();<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> String code;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> int strlength = str.length();<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> StringBuilder builder = new
StringBuilder();<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">               // Append the upper-case hex code of each character<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">               for (int i = 0; i < strlength;
i++) {<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">                    char ch = str.charAt(i);<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">                    code =
Integer.toHexString(ch).toUpperCase();<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">                    builder.append(code);<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">               }<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> output.append(builder.toString());<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> } catch (final Exception e) {<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> output.append(new byte[0]);<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> }<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> return output;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> }<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
</div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242; text-indent: 13.5pt;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;">}<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<br /></div>
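<div class="MsoNormal">
To try the conversion logic without a Pig installation, the per-character hex step can be sketched as a plain Java method. The class name <b>HexSketch</b> and the method <b>toHex</b> are hypothetical names used only for this illustration; the sketch mirrors the loop in the UDF but is not the UDF itself.</div>

```java
/** Stand-alone sketch of the per-character hex conversion; names are illustrative only. */
final class HexSketch {

    /** Concatenates the upper-case hex code of every character in the input string. */
    static String toHex(String input) {
        if (input == null) {
            return null; // mirror the UDF: no input, no output
        }
        StringBuilder builder = new StringBuilder();
        for (int i = 0; i < input.length(); i++) {
            // Integer.toHexString does not zero-pad, so 'P' (0x50) becomes "50"
            builder.append(Integer.toHexString(input.charAt(i)).toUpperCase());
        }
        return builder.toString();
    }

    public static void main(String[] args) {
        System.out.println(toHex("Pig")); // prints 506967
    }
}
```

<div class="MsoNormal">
Note that characters with a code below 0x10 (a tab or a newline, for example) produce a single hex digit, so pad each code to two digits if the output must be decoded back unambiguously.</div>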
<div class="MsoNormal">
<b><span style="font-size: 12.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;"><u>Schema</u>:<o:p></o:p></span></b></div>
<div class="MsoNormal">
For a Tuple or DataBag return type, the schema information
needs to be provided explicitly in the outputSchema method. You need to import the following two classes and implement this
method:</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<b><span style="color: #444444;">import
org.apache.pig.impl.logicalLayer.schema.Schema;<o:p></o:p></span></b></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<b><span style="color: #444444;">import
org.apache.pig.data.DataType;</span></b><span style="color: #17365d;"><o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<br /></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> public Schema
outputSchema(Schema input) {<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> try{<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> Schema tupleSchema = new Schema();<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> tupleSchema.add(input.getField(1));<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> tupleSchema.add(input.getField(0));<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> return new Schema(new Schema.FieldSchema(getSchemaName(this.getClass().getName().toLowerCase(), input),tupleSchema, DataType.TUPLE));<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> }catch (Exception e){<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> return null;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> }<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> }<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
Build the above UDF as a jar file: hexConvertor.jar</div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
Now let’s see how to call this UDF in a Pig script. Register the jar file and call the function.</div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> REGISTER hexConvertor.jar;<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> A = LOAD 'sample_data' AS
(field1: bytearray, age: int);<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> B = FOREACH A GENERATE com.test.udf.</span><span style="font-family: 'Times New Roman', serif; line-height: 18.3999996185303px; text-indent: 18px;">HexConversion</span><span style="font-family: 'Times New Roman', serif; line-height: 115%;">(field1);<o:p></o:p></span></div>
<div class="MsoNormal">
</div>
<div class="MsoNormal" style="background: #F2F2F2; margin-bottom: .0001pt; margin-bottom: 0in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: 'Times New Roman', serif; line-height: 115%;"> DUMP B</span><span style="font-family: 'Times New Roman', serif; font-size: 12pt; line-height: 115%;">;<o:p></o:p></span></div>
<div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0in;">
Now you can also write your own UDF. Cheers..!!!</div>
</div>
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com11tag:blogger.com,1999:blog-7459259550976670934.post-73781481827780606402015-02-20T12:02:00.002+05:302015-02-20T23:22:39.594+05:30Penn Treebank POS Tags in Natural Language Processing<div dir="ltr" style="text-align: left;" trbidi="on">
Part-of-speech (POS) tags are among the most commonly used annotations in Natural Language Processing. Let's say we parse a sentence and get the following parse tree:<br />
<br />
<div style="margin-bottom: .0001pt; margin: 0in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;">(TOP </span></span><span style="color: #3d85c6; font-size: 9pt; text-indent: 0.5in;">(SBARQ </span><span style="color: #3d85c6; font-size: 9pt; text-indent: 0.5in;">(WHADVP (WRB How))</span></div>
<div style="margin: 0in 0in 0.0001pt 0.5in; text-indent: 0.5in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;">(SQ (VBZ is)<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt 1in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;">
(NP<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt 1in; text-indent: 0.5in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;"> (NP (DT the)
(NN author) )<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt 1.5in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;"> (PP (IN of)<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt 1.5in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;">
(NP<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt 1.5in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;">
(NP (DT The) (NNP Call) )<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt 1.5in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;">
(PP (IN of)<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt 1.5in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;">
(NP (DT the) (NNP Wild?) )<br />
)<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt 1.5in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;">
)<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt 1.5in;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;"> )<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;"> )<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;"> )<o:p></o:p></span></span></div>
<div style="margin: 0in 0in 0.0001pt;">
<span style="font-size: 9pt;"><span style="color: #3d85c6;"> )</span></span><span style="color: #3d85c6; font-size: 9pt;">)</span></div>
<div style="margin: 0in 0in 0.0001pt;">
<span style="color: #3d85c6; font-size: 9pt;"><br /></span></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
</div>
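<div class="MsoNormal">
To see which labels a bracketed parse like the one above actually contains, you can pull out the token that follows each opening bracket. The snippet below is a rough sketch under that assumption; the class name <b>ParseLabels</b> and its <b>labels</b> method are hypothetical and do not come from any NLP library.</div>

```java
import java.util.ArrayList;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Set;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

/** Illustrative helper: lists the distinct labels of a bracketed parse tree string. */
final class ParseLabels {

    // A label is the run of non-space, non-bracket characters right after "("
    private static final Pattern LABEL = Pattern.compile("\\(([^\\s()]+)");

    /** Returns the distinct labels in order of first appearance. */
    static List<String> labels(String parse) {
        Set<String> seen = new LinkedHashSet<>();
        Matcher matcher = LABEL.matcher(parse);
        while (matcher.find()) {
            seen.add(matcher.group(1));
        }
        return new ArrayList<>(seen);
    }

    public static void main(String[] args) {
        String parse = "(TOP (SBARQ (WHADVP (WRB How)) (SQ (VBZ is) (NP (DT the) (NN author)))))";
        System.out.println(labels(parse));
        // prints [TOP, SBARQ, WHADVP, WRB, SQ, VBZ, NP, DT, NN]
    }
}
```

<div class="MsoNormal">
Each label found this way can then be looked up in the table below.</div>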
<div class="MsoNormal">
Now the question arises: what do these POS tags (e.g. <span style="font-size: x-small;">SBARQ, WHADVP, VBZ</span> etc.) stand for? </div>
To save you from looking them up each time, I am consolidating all the POS tags here:<br />
<br />
<table border="0" cellpadding="0" class="MsoNormalTable">
<tbody>
<tr>
<td style="background: #17365D; mso-background-themecolor: text2; mso-background-themeshade: 191; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";"><span style="color: white;">Tag<o:p></o:p></span></span></div>
</td>
<td style="background: #17365D; mso-background-themecolor: text2; mso-background-themeshade: 191; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";"><span style="color: white;">Description</span><o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">CC<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Coordinating conjunction<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">CD<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Cardinal number<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">DT<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Determiner<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">EX<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Existential <i>there</i><o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">FW<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Foreign word<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">IN<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Preposition or subordinating
conjunction<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">JJ<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Adjective<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">JJR<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Adjective, comparative<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">JJS<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Adjective, superlative<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">LS<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">List item marker<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">MD<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Modal<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">NN<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Noun, singular or mass<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">NNS<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Noun, plural<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">NNP<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Proper noun, singular<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">NNPS<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Proper noun, plural<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">PDT<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Predeterminer<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">POS<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Possessive ending<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">PRP<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Personal pronoun<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">PRP$<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Possessive pronoun<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">RB<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Adverb<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">RBR<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Adverb, comparative<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">RBS<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Adverb, superlative<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">RP<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Particle<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">S<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Simple declarative clause, i.e.
one that is not introduced by a (possibly empty) subordinating conjunction or
a wh-word and that does not exhibit subject-verb inversion<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">SBAR<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Clause introduced by a (possibly
empty) subordinating conjunction<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">SBARQ<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Direct question introduced by a
wh-word or a wh-phrase. Indirect questions and relative clauses should be
bracketed as SBAR, not SBARQ<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">SINV<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Inverted declarative sentence,
i.e. one in which the subject follows the tensed verb or modal.</span><o:p></o:p></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">SQ<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Inverted yes/no question, or main clause of
a wh-question, following the wh-phrase in SBARQ.</span><o:p></o:p></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">SYM<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Symbol<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">VBD<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Verb, past tense<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">VBG<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Verb, gerund or present participle<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">VBN<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Verb, past participle<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">VBP<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Verb, non-3rd person singular
present<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">VBZ<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Verb, 3rd person singular present<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">WDT<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Wh-determiner<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">WP<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Wh-pronoun<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">WP$<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Possessive wh-pronoun<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt;"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">WRB<o:p></o:p></span></div>
</td>
<td style="background: #F2F2F2; mso-background-themecolor: background1; mso-background-themeshade: 242; padding: 1.5pt 1.5pt 1.5pt 1.5pt; width: 287.3pt;" width="383"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; mso-bidi-font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Wh-adverb<o:p></o:p></span></div>
</td>
</tr>
</tbody></table>
<br />
<div class="MsoNormal">
<span style="font-size: 10.0pt; line-height: 115%; mso-bidi-font-size: 11.0pt;">I will keep updating the list with new POS tags. Cheers!</span></div>
<br /></div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com3tag:blogger.com,1999:blog-7459259550976670934.post-33607815828166642212014-07-04T11:21:00.001+05:302014-07-04T11:22:37.836+05:30Update a running topology on Storm Cluster<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Sometimes we want to update a running topology based on
certain conditions or rules. As of now, <a href="https://storm.incubator.apache.org/documentation/Running-topologies-on-a-production-cluster.html" target="_blank">Storm</a> has no direct
command or API to update a topology in place, so there are two approaches.<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b>First approach</b>: Kill
the topology from the command line using:</div>
<div class="MsoNormal">
<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div style="background: #D6E3BC; border: solid windowtext 1.0pt; mso-background-themecolor: accent3; mso-background-themetint: 102; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 0in 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; padding: 0in;">
storm kill &lt;topology-name&gt;<o:p></o:p></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Then submit it again. But what if we don’t want to kill it manually
and would rather have the code handle it automatically?</div>
<div class="MsoNormal">
<o:p></o:p></div>
<div class="MsoNormal">
<b><br /></b></div>
<div class="MsoNormal">
<b>Second approach</b>: Use <a href="http://storm.incubator.apache.org/apidocs/backtype/storm/generated/Nimbus.Client.html" target="_blank">NimbusClient</a>
to kill the topology programmatically and start again.</div>
<div class="MsoNormal">
<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div style="background: #D6E3BC; border: solid windowtext 1.0pt; mso-background-themecolor: accent3; mso-background-themetint: 102; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 1.0pt 1.0pt 1.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">Map storm_conf =
Utils.readStormConfig();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">Client client =
NimbusClient.getConfiguredClient(storm_conf).getClient();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">Iterator&lt;TopologySummary&gt;
topologyList = client.getClusterInfo().get_topologies_iterator();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">if
(topologyNameExists(storm_conf, topologyName)) {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> client.killTopology(topologyName);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">}</span></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
The above code kills the topology.
However, the topology takes some time to be cleared from the cluster's list, so if you immediately
submit a topology with the same name, Storm throws the exception “<b>Same name topology exists on the cluster</b>”. You therefore need to keep checking for
a few seconds until the old topology is gone before submitting again.</div>
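<div class="MsoNormal">
The waiting step can be sketched as a small polling helper. This is only a minimal sketch and not part of Storm: the class name <code>TopologyWait</code>, the method <code>waitUntil</code> and its timeout parameters are made up for illustration, and the condition passed in is assumed to be something like "<code>topologyNameExists(storm_conf, topologyName)</code> returns false", reusing the check from the snippet above.</div>

```java
import java.util.function.BooleanSupplier;

// Hypothetical helper (not a Storm class): polls a condition until it
// holds or a timeout elapses.
final class TopologyWait {

    private TopologyWait() {
    }

    /**
     * Calls isGone repeatedly, sleeping pollMillis between calls.
     * Returns true as soon as isGone reports true, or false if
     * timeoutMillis elapses first or the thread is interrupted.
     */
    static boolean waitUntil(BooleanSupplier isGone, long timeoutMillis, long pollMillis) {
        long deadline = System.nanoTime() + timeoutMillis * 1_000_000L;
        while (true) {
            if (isGone.getAsBoolean()) {
                return true;
            }
            // Overflow-safe check that the deadline has passed.
            if (System.nanoTime() - deadline >= 0) {
                return false;
            }
            try {
                Thread.sleep(pollMillis);
            } catch (InterruptedException e) {
                // Preserve the interrupt flag for the caller and give up.
                Thread.currentThread().interrupt();
                return false;
            }
        }
    }
}
```

<div class="MsoNormal">
With a helper like this, you would kill the topology, call <code>waitUntil</code> with, say, a 60 second timeout and a 1 second poll interval, and submit the new topology only if it returns true.</div>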
<div class="MsoNormal">
<o:p></o:p></div>
<div class="MsoNormal">
Here is the running example:<o:p></o:p></div>
<div class="MsoNormal">
<o:p> </o:p><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;"> </span></b></div>
<div style="background: #D6E3BC; border: solid windowtext 1.0pt; mso-background-themecolor: accent3; mso-background-themetint: 102; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 1.0pt 1.0pt 1.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.Config;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.LocalCluster;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.StormSubmitter;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.generated.AlreadyAliveException;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> backtype.storm.generated.ClusterSummary;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.generated.TopologySummary;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.generated.Nimbus.Client;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.spout.SchemeAsMultiScheme;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.topology.TopologyBuilder;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.utils.NimbusClient;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
backtype.storm.utils.Utils;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">public</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">class</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> TestTopology {</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">public</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">static</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">void</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> main(String[]
args) </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">throws</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> Exception {</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> String topologyName = </span><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt;">"testTopology"</span><span style="font-family: 'Courier New'; font-size: 10pt;">;</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> TopologyBuilder builder = </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">new</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
TopologyBuilder();</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> builder.setSpout(</span><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt;">"testspout"</span><span style="font-family: 'Courier New'; font-size: 10pt;">, </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">new</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> TestSpout(),
1);</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> builder.setBolt(</span><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt;">"testbolt"</span><span style="font-family: 'Courier New'; font-size: 10pt;">,</span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">new</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> TestBolt(), 1)</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> .shuffleGrouping(</span><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt;">"testspout"</span><span style="font-family: 'Courier New'; font-size: 10pt;">);</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> Config conf = </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">new</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> Config();</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> conf.setDebug(</span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">false</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">);</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> Map storm_conf =
Utils.readStormConfig();</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> Client client =
NimbusClient.getConfiguredClient(storm_conf)</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> .getClient();</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> Iterator&lt;TopologySummary&gt;
topologyList = client.getClusterInfo()</span><span style="font-family: 'Courier New'; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> .get_topologies_iterator();</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">if</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
(topologyNameExists(storm_conf, topologyName)) {</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> client.killTopology(topologyName);</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> } </span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> // Poll until Nimbus has removed the killed topology from its list.</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: 'Courier New'; font-size: 10.0pt;">while</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
(topologyNameExists(storm_conf, topologyName)) {</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> Thread.sleep(1000); // pause between checks instead of busy-waiting</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> }<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10.0pt;"> // Submit the topology built above (spout and bolt are already wired into builder).<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">try</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">{</span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">
StormSubmitter.submitTopology(topologyName, conf,</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> builder.createTopology());</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> } </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt;">catch</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">
(AlreadyAliveException ae) {</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> ae.printStackTrace();</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> }</span><span style="font-family: "Courier New"; font-size: 10.0pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> Thread.sleep(60000);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in; text-indent: 0.5in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">}<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">}</span><o:p></o:p></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
The above code starts the
topology if it is not already running; otherwise it kills the running topology and resubmits it. You can monitor this from the Storm UI.</div>
<div class="MsoNormal">
<o:p></o:p></div>
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com3tag:blogger.com,1999:blog-7459259550976670934.post-51785714543273826592014-06-20T22:10:00.000+05:302014-06-20T22:10:48.182+05:30Submitting a topology to Remote Storm Cluster<div dir="ltr" style="text-align: left;" trbidi="on">
<div style="margin-bottom: .0001pt; margin: 0in;">
It is easy to write a topology and submit it from a machine that is
part of the Storm cluster.<o:p></o:p></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
The problem arises when we need to submit a topology
to a remote Storm cluster from a local machine.<o:p></o:p></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
What should we do in that case?<o:p></o:p></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
<br /></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
Here is an approach for submitting a topology to a remote
cluster.<o:p></o:p></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
<br /></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
I have a local Windows machine and one Storm cluster (1
Nimbus Linux machine and 2 supervisor Linux machines).<o:p></o:p></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
Let's say the following are the machines in the cluster:<o:p></o:p></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
<br /></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
Nimbus Machine : 192.168.1.5<o:p></o:p></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
Supervisor Machine 1: 192.168.1.6<o:p></o:p></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
Supervisor Machine 2: 192.168.1.7<o:p></o:p></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
<br /></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
The Storm cluster should be up and running on the above machines.<o:p></o:p></div>
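<div style="margin-bottom: .0001pt; margin: 0in;">
Before submitting, it can help to confirm that the Nimbus machine is reachable from the local machine. The following is a minimal sketch that uses only plain Java sockets, not the Storm API. The class name NimbusReachability and the method isReachable are hypothetical, and port 6627 is an assumption (the usual default for nimbus.thrift.port); use whatever port your cluster is configured with.</div>

```java
import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;

public class NimbusReachability {

    /** Returns true if a TCP connection to host:port succeeds within timeoutMillis. */
    static boolean isReachable(String host, int port, int timeoutMillis) {
        try (Socket socket = new Socket()) {
            socket.connect(new InetSocketAddress(host, port), timeoutMillis);
            return true;
        } catch (IOException e) {
            return false; // refused, timed out, or host could not be resolved
        }
    }

    public static void main(String[] args) {
        // Nimbus IP taken from the example cluster in this post.
        String nimbusHost = args.length > 0 ? args[0] : "192.168.1.5";
        // Assumed Thrift port; check nimbus.thrift.port on your cluster.
        int nimbusPort = args.length > 1 ? Integer.parseInt(args[1]) : 6627;
        System.out.println(nimbusHost + ":" + nimbusPort + " reachable = "
                + isReachable(nimbusHost, nimbusPort, 2000));
    }
}
```

<div style="margin-bottom: .0001pt; margin: 0in;">
A successful connection only shows that something is listening on that port; it does not prove that Nimbus is healthy, so the Storm UI remains the better place to confirm cluster state.</div>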
<div style="margin-bottom: .0001pt; margin: 0in;">
<br /></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
Now, from the local machine, use <a href="http://storm.incubator.apache.org/apidocs/backtype/storm/generated/Nimbus.Client.html">NimbusClient</a> to submit the topology jar to the cluster. <span style="font-size: 13.5pt;"><o:p></o:p></span></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
<br /></div>
<div style="background: #E1EBF7; border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin: 0in 0in 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 10pt;">NimbusClient nimbus =<span class="apple-converted-space"> </span></span><b><span style="color: #7f0055; font-family: Consolas; font-size: 10.0pt;">new</span></b><span class="apple-converted-space"><span style="font-family: Consolas; font-size: 10pt;"> </span></span><span style="font-family: Consolas; font-size: 10pt;">NimbusClient(storm_conf, </span><span style="color: #2a00ff; font-family: Consolas; font-size: 10.0pt;">"&lt;nimbus machine ip&gt;"</span><span style="font-family: Consolas; font-size: 10pt;">, &lt;nimbus port&gt;);</span><span class="apple-converted-space"><span style="font-family: Calibri, sans-serif; font-size: 11.5pt;"><o:p></o:p></span></span></div>
<div style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin: 0in 0in 0.0001pt; padding: 0in;">
<span style="font-family: Consolas; font-size: 10pt;">nimbus.getClient().submitTopology(topologyName,uploadedJarLocation,jsonConf,
builder.createTopology());</span><span style="font-size: 13.5pt;"><o:p></o:p></span></div>
</div>
<div style="margin-bottom: .0001pt; margin: 0in;">
<br /></div>
<div style="margin-bottom: .0001pt; margin: 0in;">
</div>
<div style="margin-bottom: .0001pt; margin: 0in;">
Here is a running example:</div>
<div style="margin-bottom: .0001pt; margin: 0in;">
<br /></div>
<div style="background: #E1EBF7; border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">java.util.Map;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">org.json.simple.JSONValue;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">backtype.storm.Config;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">backtype.storm.StormSubmitter;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">backtype.storm.generated.AlreadyAliveException;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">backtype.storm.generated.Nimbus.Client;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">backtype.storm.topology.TopologyBuilder;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">backtype.storm.utils.NimbusClient;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">import</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">backtype.storm.utils.Utils;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<br /></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">public</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">class</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">RunningClusterTopology {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">public</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">static</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">void</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">main(String[] args)</span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">throws</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">Exception {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in; text-indent: 0.5in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">TopologyBuilder builder =</span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">new</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">TopologyBuilder();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">Config conf =</span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">new</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">Config();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">conf.put(Config.</span><i><span style="color: #0000c0; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">NIMBUS_HOST</span></i><span style="font-family: 'Courier New'; font-size: 10pt;">,</span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">"192.168.1.5"</span><span style="font-family: 'Courier New'; font-size: 10pt;">);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">conf.setDebug(</span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">true</span></b><span style="font-family: 'Courier New'; font-size: 10pt;">);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<u><span style="font-family: 'Courier New'; font-size: 10pt;">Map</span></u><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">storm_conf = Utils.<i>readStormConfig</i>();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<u><span style="font-family: 'Courier New'; font-size: 10pt;">storm_conf.put(</span></u><u><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">"nimbus.host"</span></u><u><span style="font-family: 'Courier New'; font-size: 10pt;">,</span></u><u><span style="font-family: 'Courier New'; font-size: 10pt;"> </span></u><u><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">"</span></u><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">192.168.1.5<u>"</u></span><u><span style="font-family: 'Courier New'; font-size: 10pt;">)</span></u><span style="font-family: 'Courier New'; font-size: 10pt;">;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">Client</span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><u><span style="font-family: 'Courier New'; font-size: 10pt;">client</span></u><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">= NimbusClient.<i>getConfiguredClient</i>(storm_conf)<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">.getClient();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">String inputJar =</span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">"C:\\workspace\\TestStormRunner\\target\\TestStormRunner-0.0.1-SNAPSHOT-jar-with-dependencies.jar"</span><span style="font-family: 'Courier New'; font-size: 10pt;">;<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">NimbusClient nimbus =</span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">new</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">NimbusClient(storm_conf,</span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="color: #2a00ff; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">"192.168.1.5"</span><span style="font-family: 'Courier New'; font-size: 10pt;">,<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">6627);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;"> // upload topology jar
to Cluster using StormSubmitter<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">String uploadedJarLocation = StormSubmitter.<i>submitJar</i>(storm_conf,<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">inputJar);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">try</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">{<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">String jsonConf = JSONValue.<i>toJSONString</i>(storm_conf);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">nimbus.getClient().submitTopology("testtopology",<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">uploadedJarLocation, jsonConf, builder.createTopology());<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">}</span><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><b><span style="color: #7f0055; font-family: "Courier New"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";">catch</span></b><span style="font-family: 'Courier New'; font-size: 10pt;"> </span><span style="font-family: 'Courier New'; font-size: 10pt;">(AlreadyAliveException
ae) {<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">ae.printStackTrace();<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">}<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">Thread.<i>sleep</i>(60000);<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">}<o:p></o:p></span></div>
<div class="MsoNormal" style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; background-position: initial; background-repeat: initial; background-size: initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">}<o:p></o:p></span></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
This submits the topology to the Nimbus machine, and the topology then runs
on the 2 supervisor machines (192.168.1.6 and 192.168.1.7).</div>
<div class="MsoNormal">
To test it, open the Storm UI in a browser: http://&lt;nimbus-ip&gt;:&lt;port&gt;</div>
<div style="margin-bottom: .0001pt; margin: 0in;">
</div>
<div class="MsoNormal">
For example: <a href="http://192.168.1.5:8080/">http://192.168.1.5:8080</a>
</div>
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com10tag:blogger.com,1999:blog-7459259550976670934.post-47158678274605157182013-11-10T00:15:00.000+05:302013-11-10T00:15:09.370+05:30Singular Value Decomposition(SVD) in R <div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
When a dataset has a large number of attributes, it’s
hard to identify which variables are the most useful. The Singular
Value Decomposition (SVD) algorithm helps us reduce the dimensions of the data.</div>
<div class="MsoNormal">
Here we’ll learn how to run SVD in R. But first, we need to know what
dimensionality reduction means.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<a href="http://en.wikipedia.org/wiki/Dimensionality_reduction">Dimensionality
reduction</a> is the process of reducing the number of random variables under
consideration, usually through feature selection or feature extraction. There are many ways
to do it. Here we will look at SVD and how it helps with dimensionality reduction.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Before looking at the implementation, let’s have a brief
overview of what SVD is.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<a href="http://en.wikipedia.org/wiki/Singular_value_decomposition">Singular Value
Decomposition</a> (SVD) is a matrix factorization method used in data mining.</div>
<div class="MsoNormal">
In data mining, this algorithm is used to reduce the number
of attributes that go into the analysis. The reduction removes
redundant data, meaning attributes that are linearly dependent on other attributes in the linear algebra sense.
For example, imagine a database that contains one field storing the water's
temperature for several samples and another storing its state (solid, liquid
or gas). It’s easy to see that the second field depends on the first,
so it adds little information, and SVD can help reveal this kind of redundancy in the
analysis.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b><span style="font-size: large;">Algorithm</span>:<u><o:p></o:p></u></b></div>
<div class="MsoNormal">
<b><u><br /></u></b></div>
<div class="MsoNormal">
SVD is the factorization of an m x n matrix A of real or complex numbers,
with m ≥ n:</div>
<div class="MsoNormal">
<br /></div>
<div style="background: #DBE5F1; border: solid windowtext 1.0pt; mso-background-themecolor: accent1; mso-background-themetint: 51; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; border: none; padding: 0in;">
A = U S V<sup>T<o:p></o:p></sup></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
Where U is an m x n matrix with orthonormal columns, S is an n x n diagonal
matrix of singular values, V is an orthogonal n x n matrix, and </div>
<div class="MsoNormal">
<br /></div>
<div style="background: #DBE5F1; border: solid windowtext 1.0pt; mso-background-themecolor: accent1; mso-background-themetint: 51; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; border: none; padding: 0in;">
U<sup>T</sup>U = V<sup>T</sup>V
= I</div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
where I is the n x n identity matrix.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<b>To compute SVD :</b></div>
<div class="MsoNormal">
</div>
<ul style="text-align: left;">
<li><span style="text-indent: -0.25in;">Find the eigenvectors and eigenvalues of A</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;">A
and AA</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;">.</span></li>
<li><span style="text-indent: -0.25in;">The eigenvectors of A</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;">A are the columns of V and the eigenvectors of AA</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;"> are
the columns of U. The singular values of A, on the diagonal of matrix S, are
the square roots of the positive eigenvalues that AA</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;"> and A</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;">A have in common.</span></li>
<li><span style="text-indent: -0.25in;">AA</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;"> is m x m and
A</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;">A is n x n, so they have the same number of eigenvalues only when A is a square matrix.
Otherwise, both matrices share the same nonzero eigenvalues and the larger one has
additional eigenvalues equal to zero. We can say that the singular values of A are the
square roots of the eigenvalues of whichever of AA</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;"> and A</span><sup style="text-indent: -0.25in;">T</sup><span style="text-indent: -0.25in;">A has
fewer eigenvalues.</span></li>
<li><span style="text-indent: -0.25in;">The number of nonzero singular values of A equals the
rank of A, which is the number of linearly independent rows or
columns.</span></li>
<li><span style="text-indent: -0.25in;">The rank cannot be greater than min(m,n).</span></li>
</ul>
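<div class="MsoNormal">
The steps above can be checked by hand on a small matrix. The following is a minimal sketch in plain Java, not a general SVD routine: the class name SingularValuesSketch and the 3 x 2 example matrix are assumptions made for illustration, and the closed-form eigenvalue formula only works because A<sup>T</sup>A is 2 x 2 here.</div>

```java
final class SingularValuesSketch {

    /** Returns the n x n matrix A^T A for an m x n matrix A. */
    static double[][] gram(double[][] a) {
        int m = a.length, n = a[0].length;
        double[][] g = new double[n][n];
        for (int i = 0; i < n; i++)
            for (int j = 0; j < n; j++)
                for (int k = 0; k < m; k++)
                    g[i][j] += a[k][i] * a[k][j];
        return g;
    }

    /**
     * Singular values (largest first) of an m x 2 matrix, computed as the
     * square roots of the eigenvalues of the 2 x 2 matrix A^T A.
     */
    static double[] singularValues(double[][] a) {
        double[][] g = gram(a);
        double p = g[0][0], q = g[0][1], r = g[1][1];
        // Eigenvalues of the symmetric matrix [[p, q], [q, r]] are
        // mean +/- sqrt(((p - r) / 2)^2 + q^2).
        double mean = (p + r) / 2.0;
        double dev = Math.sqrt((p - r) * (p - r) / 4.0 + q * q);
        return new double[] {
            Math.sqrt(mean + dev),
            Math.sqrt(Math.max(0.0, mean - dev))
        };
    }

    public static void main(String[] args) {
        // Hypothetical example matrix: A^T A = [[2, 1], [1, 2]],
        // whose eigenvalues are 3 and 1.
        double[][] a = { {1, 1}, {0, 1}, {1, 0} };
        double[] s = singularValues(a);
        System.out.println("singular values: " + s[0] + " " + s[1]);
    }
}
```

<div class="MsoNormal">
Both singular values are nonzero, so this example matrix has rank 2, which is also min(m,n) for a 3 x 2 matrix.</div>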
<br />
<div class="MsoNormal">
Now that we have a clear idea of SVD, let’s see how to run
SVD in R.</div>
<div class="MsoNormal">
<b><u><br /></u></b></div>
<div class="MsoNormal">
<b><span style="font-size: large;">How to run SVD in
R:</span><u><o:p></o:p></u></b></div>
<div class="MsoNormal">
<b><u><br /></u></b></div>
<div class="MsoNormal">
Here is an example of running SVD in R.</div>
Suppose we have a 4x3 matrix.<br />
<div style="background: #DBE5F1; border: solid windowtext 1.0pt; mso-background-themecolor: accent1; mso-background-themetint: 51; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">hilbert <-
function(n) { i <- 1:n; 1 / outer(i - 1, i, "+") }<o:p></o:p></span></div>
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; border: none; margin: 0in 0in 0.0001pt 91.6pt; padding: 0in; text-indent: -91.6pt;">
<span style="font-family: 'Courier New'; font-size: 10pt;">X <- hilbert(4)[, 1:3]<o:p></o:p></span></div>
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">(s <- svd(X))<o:p></o:p></span></div>
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">D <- diag(s$d)<o:p></o:p></span></div>
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">s$u %*% D %*% t(s$v)
# X = U D V'<o:p></o:p></span></div>
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; border: none; margin-bottom: 0.0001pt; padding: 0in;">
<span style="font-family: 'Courier New'; font-size: 10pt;">t(s$u) %*% X %*% s$v
# D = U' X V<o:p></o:p></span></div>
</div>
<div class="MsoNormal">
<br /></div>
svd(X) returns a list with three components:<br />
<b>d : a vector containing the singular values of x, of length min(n, p).</b><br />
<b>u : a matrix whose columns contain the left singular vectors of x.</b><br />
<b>v : a matrix whose columns contain the right singular vectors of x.</b><br />
<b><br /></b>
<div class="MsoNormal">
<b>s$d</b> is a vector of singular values as below:</div>
<div class="MsoNormal">
<br /></div>
<div style="background: #DBE5F1; border: solid windowtext 1.0pt; mso-background-themecolor: accent1; mso-background-themetint: 51; mso-border-alt: solid windowtext .5pt; mso-element: para-border-div; padding: 1.0pt 4.0pt 1.0pt 4.0pt;">
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; border: none; padding: 0in;">
[1] 1.451914187
0.143312317 0.004228883</div>
</div>
<br />
<div class="MsoNormal">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%; mso-fareast-font-family: "Times New Roman";">Using SVD,
we can remove noise and linearly dependent elements by keeping only the largest
singular values. This is very useful in data mining.<o:p></o:p></span></div>
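<div class="MsoNormal">
To make the idea of keeping only the most important singular values concrete, here is a minimal sketch in plain Java. It rebuilds the same 4 x 3 matrix as the R example, finds the largest singular value by power iteration on X<sup>T</sup>X, and measures how much of X is lost when only that one component is kept. The class name Rank1Sketch, its method names and the choice of power iteration are assumptions for illustration; R's svd() does not work this way internally.</div>

```java
import java.util.Arrays;

final class Rank1Sketch {

    /** Same matrix as hilbert(4)[, 1:3] in R: X[i][j] = 1 / (i + j + 1), 0-based. */
    static double[][] hilbert4x3() {
        double[][] x = new double[4][3];
        for (int i = 0; i < 4; i++)
            for (int j = 0; j < 3; j++)
                x[i][j] = 1.0 / (i + j + 1);
        return x;
    }

    /** Matrix-vector product X v. */
    static double[] times(double[][] x, double[] v) {
        double[] y = new double[x.length];
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < v.length; j++)
                y[i] += x[i][j] * v[j];
        return y;
    }

    static double norm(double[] a) {
        double sum = 0.0;
        for (double c : a) sum += c * c;
        return Math.sqrt(sum);
    }

    /**
     * Power iteration on X^T X. On return, v holds the leading right
     * singular vector and the result is the largest singular value.
     */
    static double topSingularValue(double[][] x, double[] v) {
        int n = x[0].length;
        Arrays.fill(v, 1.0 / Math.sqrt(n));        // unit start vector
        for (int iter = 0; iter < 100; iter++) {
            double[] u = times(x, v);              // u = X v
            double[] w = new double[n];            // w = X^T u = (X^T X) v
            for (int i = 0; i < x.length; i++)
                for (int j = 0; j < n; j++)
                    w[j] += x[i][j] * u[i];
            double len = norm(w);
            for (int j = 0; j < n; j++) v[j] = w[j] / len;
        }
        return norm(times(x, v));                  // sigma1 = ||X v1||
    }

    /** Frobenius norm of X minus its rank-1 approximation (X v) v^T. */
    static double rank1Error(double[][] x, double[] v) {
        double[] xv = times(x, v);                 // equals sigma1 * u1
        double sum = 0.0;
        for (int i = 0; i < x.length; i++)
            for (int j = 0; j < v.length; j++) {
                double d = x[i][j] - xv[i] * v[j];
                sum += d * d;
            }
        return Math.sqrt(sum);
    }

    public static void main(String[] args) {
        double[][] x = hilbert4x3();
        double[] v = new double[3];
        System.out.println("largest singular value: " + topSingularValue(x, v));
        System.out.println("rank-1 approximation error: " + rank1Error(x, v));
    }
}
```

<div class="MsoNormal">
The first printed value should agree with the first entry of s$d shown above. The error is small compared with it because the remaining two singular values are much smaller, which is why dropping them loses little information.</div>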
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com2tag:blogger.com,1999:blog-7459259550976670934.post-25658651354168943112013-10-03T11:22:00.001+05:302016-04-25T17:35:23.556+05:30Logistic Regression in Mahout<div dir="ltr" style="text-align: left;" trbidi="on">
<span style="color: black;"><a href="http://en.wikipedia.org/wiki/Logistic_regression">Logistic Regression (LR)</a> is a type of regression analysis used to predict the probability that an event occurs. It uses several predictors, which may be either numerical or categorical.</span><br />
It refers specifically to problems in which the dependent variable is dichotomous, for example:<br />
<b>Predict whether a patient has a given disease or not, or whether a user will buy a product or not</b>, etc.<br />
<br />
It can be implemented in Mahout as well as in R. Here we'll talk about the Mahout implementation.<br />
Mahout's implementation uses <b>Stochastic Gradient Descent (SGD)</b>, which updates the model one training example at a time and therefore scales to large training data sets.<br />
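<br />
To make the idea concrete, here is a minimal pure-Python sketch of logistic regression trained with SGD. It is not Mahout's code: the function names, the toy data and the fixed learning rate are all assumptions made for illustration, and it leaves out the extra machinery Mahout exposes through options such as features and lambda.<br />
<br />

```python
import math
import random


def sigmoid(z):
    """Logistic function mapping a real-valued score to a probability."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)


def train_sgd(rows, labels, passes=100, rate=0.5, seed=0):
    """Fit weights (bias first) with one gradient step per training example."""
    rng = random.Random(seed)
    weights = [0.0] * (len(rows[0]) + 1)
    order = list(range(len(rows)))
    for _ in range(passes):
        rng.shuffle(order)  # visit the examples in a new random order each pass
        for i in order:
            x = [1.0] + list(rows[i])
            p = sigmoid(sum(w * v for w, v in zip(weights, x)))
            error = labels[i] - p
            weights = [w + rate * error * v for w, v in zip(weights, x)]
    return weights


def predict(weights, row):
    """Predicted probability that the row belongs to class 1."""
    x = [1.0] + list(row)
    return sigmoid(sum(w * v for w, v in zip(weights, x)))


# Toy, made-up data: the label is 1 when the single feature is large.
rows = [[0.1], [0.2], [0.3], [0.7], [0.8], [0.9]]
labels = [0, 0, 0, 1, 1, 1]
weights = train_sgd(rows, labels)
print(predict(weights, [0.15]), predict(weights, [0.85]))
```

<br />
Each update looks at a single row, which is why the approach does not need the whole data set in memory at once.<br />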
<br />
Following are the steps to run LR:<br />
<br />
<b><span style="font-size: large;"># To train the model</span></b> -<br />
This step takes a training data set as input and produces a model that can then be used to classify data in the same format.<br />
<br />
<table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="background: #DBE5F1; border-collapse: collapse; border: none; mso-background-themecolor: accent1; mso-background-themetint: 51; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-yfti-tbllook: 1184;">
<tbody>
<tr>
<td style="border: double windowtext 1.5pt; padding: 0in 5.4pt 0in 5.4pt; width: 6.65in;" valign="top" width="638"><div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
$MAHOUT_HOME/bin/mahout org.apache.mahout.classifier.sgd.TrainLogistic
--passes 100 --rate 60 --input
$MAHOUT_HOME/examples/src/main/resources/donut.csv --features 100 --output
output/donutmodel.model --target color
--categories 2 --predictors x y
xx xy yy a b c --types n</div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
</td>
</tr>
</tbody></table>
<br />
"<b>input</b>" : training data<br />
"<b>output</b>" : path to the file where model will be written.<br />
"<b>target</b>" : dependent variable which is to be predicted<br />
"<b>categories</b>" : number of unique possible values that target can be assigned<br />
"<b>predictors</b>" : list of field names that are to be used to predict target variable<br />
"<b>types</b>" : datatypes for the items in predictor list<br />
"<b>passes</b>" : number of passes over the input data<br />
"<b>features</b>" : size of internal feature vector<br />
"<b>lambda</b>" : amount of coefficient decay (regularization) to use; it is not set in the command above<br />
"<b>rate</b>" : initial learning rate<br />
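<br />
If you plan to rerun the training with different values (for example fewer passes or another rate), it can help to keep the options in one place and build the command from them. The sketch below only assembles and prints the command shown above; the helper name <b>build_command</b> is hypothetical and nothing is executed.<br />
<br />

```python
# Option values copied from the TrainLogistic command above.
options = [
    ("passes", [100]),
    ("rate", [60]),
    ("input", ["$MAHOUT_HOME/examples/src/main/resources/donut.csv"]),
    ("features", [100]),
    ("output", ["output/donutmodel.model"]),
    ("target", ["color"]),
    ("categories", [2]),
    ("predictors", ["x", "y", "xx", "xy", "yy", "a", "b", "c"]),
    ("types", ["n"]),
]


def build_command(options):
    """Flatten (name, values) pairs into an argument list for the mahout launcher."""
    argv = ["$MAHOUT_HOME/bin/mahout", "org.apache.mahout.classifier.sgd.TrainLogistic"]
    for name, values in options:
        argv.append(f"--{name}")
        argv.extend(str(value) for value in values)
    return argv


command = build_command(options)
print(" ".join(command))
```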
<br />
The command prints output like the following and writes the model file to the given location:<br />
<b><span style="font-size: large;"><br /></span></b>
<br />
<table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="background: #DBE5F1; border-collapse: collapse; border: none; mso-background-themecolor: accent1; mso-background-themetint: 51; mso-border-alt: dash-small-gap windowtext .5pt; mso-border-insideh: .5pt dash-small-gap windowtext; mso-border-insidev: .5pt dash-small-gap windowtext; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-yfti-tbllook: 1184;">
<tbody>
<tr>
<td style="border: dashed windowtext 1.0pt; mso-border-alt: dash-small-gap windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 6.65in;" valign="top" width="638"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
Running on hadoop, using /home/hadoop/hadoop-0.20.203.0/bin/hadoop
and HADOOP_CONF_DIR=<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
MAHOUT-JOB:
/data/dataAnalytics/mahout-distribution-0.7-CUSTOM/mahout-examples-0.7-job.jar<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
13/10/03 11:02:48 WARN driver.MahoutDriver: No
org.apache.mahout.classifier.sgd.TrainLogistic.props found on classpath, will
use command-line arguments only<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
100<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
color ~ 6.214*Intercept Term + 0.894*a + -1.255*b + -26.279*c +
4.623*x + -5.436*xx + 3.050*xy + 6.001*y + -6.190*yy<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
Intercept Term 6.21450<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
a 0.89445<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
b -1.25489<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
c -26.27914<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
x 4.62344<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
xx -5.43578<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
xy 3.04982<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
y 6.00145<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
yy -6.19029<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 4.623441607 0.000000000 0.000000000 6.214498855 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 -5.435784604 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 -26.279139691 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 0.000000000 -1.254893124 0.000000000 0.000000000 -6.190291596 0.000000000 0.894450921 0.000000000 3.049819437 0.000000000 0.000000000 0.000000000 0.000000000 6.001446962 0.000000000 0.000000000<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
13/10/03 11:02:48 INFO driver.MahoutDriver: Program took 616 ms
(Minutes: 0.010266666666666667)</div>
</td>
</tr>
</tbody></table>
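<br />
The line starting with "color ~" is the fitted linear model in readable form. As a rough illustration of how such a model turns one row into a probability, the sketch below applies the logistic function to that linear combination. Several things here are assumptions: the input row is made up (it is not taken from donut.csv), xx, xy and yy are treated as the products their names suggest, and a, b and c are simply set to 0. In practice the scoring is done by Mahout itself, as shown in the next step.<br />
<br />

```python
import math

# Coefficients copied from the training output above.
coefficients = {
    "Intercept Term": 6.21450,
    "a": 0.89445,
    "b": -1.25489,
    "c": -26.27914,
    "x": 4.62344,
    "xx": -5.43578,
    "xy": 3.04982,
    "y": 6.00145,
    "yy": -6.19029,
}


def score(row):
    """Logistic function applied to the linear combination for one row of predictors."""
    z = coefficients["Intercept Term"]
    for name, value in row.items():
        z += coefficients[name] * value
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical input row, with the derived columns filled in by hand.
x, y = 0.5, 0.5
row = {"x": x, "y": y, "xx": x * x, "xy": x * y, "yy": y * y,
       "a": 0.0, "b": 0.0, "c": 0.0}
print(round(score(row), 4))
```

<br />
The result is a number between 0 and 1; which of the two color categories it refers to depends on how Mahout encoded the target, which the output above does not show.<br />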
<b><span style="font-size: large;"><br /></span></b>
<b><span style="font-size: large;"># To test the model : </span></b><br />
We generated the model in the first step. Now we'll run it against test data to see how accurately it classifies.<br />
<br />
<table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="background: #DBE5F1; border-collapse: collapse; border: none; mso-background-themecolor: accent1; mso-background-themetint: 51; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-yfti-tbllook: 1184;">
<tbody>
<tr>
<td style="border: double windowtext 1.5pt; padding: 0in 5.4pt 0in 5.4pt; width: 6.65in;" valign="top" width="638"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
$MAHOUT_HOME/bin/mahout org.apache.mahout.classifier.sgd.RunLogistic
--input $MAHOUT_HOME/examples/src/main/resources/donut-test.csv --model
output/donutmodel.model --auc --confusion</div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
</td>
</tr>
</tbody></table>
<br />
Output would be like this:<br />
<br />
<table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="background: #DBE5F1; border-collapse: collapse; border: none; mso-background-themecolor: accent1; mso-background-themetint: 51; mso-border-alt: dash-small-gap windowtext .5pt; mso-border-insideh: .5pt dash-small-gap windowtext; mso-border-insidev: .5pt dash-small-gap windowtext; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-yfti-tbllook: 1184;">
<tbody>
<tr>
<td style="border: dashed windowtext 1.0pt; mso-border-alt: dash-small-gap windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 6.65in;" valign="top" width="638"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
Running on hadoop, using /home/hadoop/hadoop-0.20.203.0/bin/hadoop
and HADOOP_CONF_DIR=<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
MAHOUT-JOB:
/data/dataAnalytics/mahout-distribution-0.7-CUSTOM/mahout-examples-0.7-job.jar<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
13/10/03 11:03:13 WARN driver.MahoutDriver: No
org.apache.mahout.classifier.sgd.RunLogistic.props found on classpath, will
use command-line arguments only<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
AUC = 0.97<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
confusion: [[24.0, 2.0], [3.0, 11.0]]<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
entropy: [[-0.2, -3.4], [-4.8, -0.1]]<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
13/10/03 11:03:14 INFO driver.MahoutDriver: Program took 130 ms
(Minutes: 0.0021666666666666666)<o:p></o:p></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
</td>
</tr>
</tbody></table>
<br />
where <b>AUC</b> is the area under the ROC curve. It ranges from 0 to 1: a value of 1 means the model ranks every positive record above every negative one, 0.5 is no better than random guessing, and values below 0.5 mean the ranking is systematically inverted. The 0.97 reported here indicates that the model separates the two classes well.<br />
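<br />
One way to read AUC is as the fraction of (positive, negative) pairs in which the positive record gets the higher score. The sketch below computes it that way on made-up scores; it illustrates the definition and is not how Mahout computes the number above.<br />
<br />

```python
def auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(positives) * len(negatives))


# Made-up scores for illustration only.
labels = [0, 0, 1, 1]
print(auc([0.1, 0.4, 0.35, 0.8], labels))  # one of the four pairs is misordered
print(auc([0.9, 0.8, 0.2, 0.1], labels))   # every pair is misordered
print(auc([0.5, 0.5, 0.5, 0.5], labels))   # all ties, same as guessing
```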
<br />
<b>confusion</b> is the confusion matrix, which counts the test records for each combination of actual and predicted class, so you can see where the predictions go wrong.<br />
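<br />
As a quick worked example, the overall accuracy can be read off the confusion matrix printed above. The sketch assumes the usual convention that the diagonal holds the correctly classified records. Since the output does not say which axis is actual and which is predicted, it computes only accuracy, which does not depend on that.<br />
<br />

```python
# Confusion matrix copied from the RunLogistic output above.
confusion = [[24.0, 2.0], [3.0, 11.0]]

total = sum(sum(row) for row in confusion)
correct = sum(confusion[i][i] for i in range(len(confusion)))
accuracy = correct / total

print(f"{correct:.0f} of {total:.0f} test rows on the diagonal, accuracy = {accuracy:.3f}")
```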
<br />
With the model trained and evaluated, we can use it to classify new data and answer the original yes/no question.<br />
<br />
So start using LR to solve your problems!</div>
Posted by Nishu Tayal<br />
<br />
<b><span style="font-size: x-large;">Installing a Storm Cluster</span></b><br />
Published 2013-09-23<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
The <a href="http://nishutayaltech.blogspot.in/2013/09/twitter-storm-real-time-hadoop.html">previous
post</a> explained what Storm is meant for. The next step
is to set up a Storm cluster.</div>
<br />
<div class="MsoNormal">
Following are the prerequisites for setting up the cluster:</div>
<div class="MsoNormal">
</div>
<ul style="text-align: left;">
<li><span style="font-family: Symbol; text-indent: -0.25in;"><span style="font-family: 'Times New Roman'; font-size: 7pt;"> </span></span><span style="text-indent: -0.25in;">Linux Operating system</span></li>
<li><span style="text-indent: -0.25in;"><span style="font-family: "Calibri","sans-serif"; font-size: 11.0pt; line-height: 115%; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-language: EN-US; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin;">Java
6 installed</span></span></li>
<li><span style="text-indent: -0.25in;"><span style="font-family: "Calibri","sans-serif"; font-size: 11.0pt; line-height: 115%; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-language: EN-US; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin;"><span style="font-size: 11pt; line-height: 115%;">Python
installed</span></span></span></li>
</ul>
<div class="MsoNormal">
<b><br /></b></div>
<div class="MsoNormal">
<b><span style="font-size: x-large;">Installation Steps :</span><o:p></o:p></b></div>
<div class="MsoNormal">
The following steps are needed to get a Storm cluster up and running.</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
</div>
<ul style="text-align: left;">
<li><b style="font-weight: bold;"><span style="font-family: Calibri, sans-serif; line-height: 115%;"><span style="font-size: large;">Set up Zookeeper Cluster</span></span><span style="font-family: "Calibri","sans-serif"; font-size: 11.0pt; line-height: 115%; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-language: EN-US; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin;"> : </span></b><span style="font-family: Calibri, sans-serif; font-size: 11pt; line-height: 115%;">Zookeeper
is used as the coordinator in a Storm
cluster. See the <a href="http://zookeeper.apache.org/doc/r3.3.3/zookeeperAdmin.html">ZooKeeper Administrator's Guide</a> for
the installation steps.</span></li>
<li><span style="font-family: Calibri, sans-serif; line-height: 115%;"><b><span style="line-height: 115%;"><span style="font-size: large;">Install native dependencies on the Nimbus and worker
machines</span></span><span style="font-size: 11pt; line-height: 115%;"> : </span></b></span><span style="font-family: Calibri, sans-serif; font-size: 11pt; line-height: 115%;">Storm
depends on two native libraries, ZeroMQ and JZMQ. They are needed only when Storm runs in
cluster mode. In local mode Storm uses a pure-Java
messaging system, so there is nothing native to install.</span></li>
</ul>
<div>
<b style="font-family: Calibri, sans-serif; line-height: 17px;"><span style="font-size: large;">ZeroMQ 2.1.7 Installation</span></b><span style="font-family: Calibri, sans-serif; font-size: 11pt; line-height: 17px;"> : Storm has been tested with ZeroMQ 2.1.7, which is the recommended release to install. You can download it from the <a href="http://download.zeromq.org/">ZeroMQ download site</a>. The installation steps are as follows.</span></div>
<div>
<span style="font-family: Calibri, sans-serif; font-size: 11pt; line-height: 17px;"><br /></span></div>
<div>
<table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="background: #D9D9D9; border-collapse: collapse; border: none; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 217; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-yfti-tbllook: 1184;">
<tbody>
<tr>
<td style="background: #F2F2F2; border: solid windowtext 1.0pt; mso-background-themecolor: background1; mso-background-themeshade: 242; mso-border-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 6.65in;" valign="top" width="638"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="background: white; mso-bidi-font-family: Calibri; mso-bidi-font-size: 11.5pt; mso-bidi-theme-font: minor-latin;">wget </span><a href="http://download.zeromq.org/zeromq-2.1.7.tar.gz">http://download.zeromq.org/zeromq-2.1.7.tar.gz</a><o:p></o:p></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>tar -xzf zeromq-2.1.7.tar.gz</b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>cd zeromq-2.1.7<o:p></o:p></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>./configure<o:p></o:p></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>make<o:p></o:p></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>sudo make install</b><b><span style="background: white; mso-bidi-font-family: Calibri; mso-bidi-font-size: 11.5pt; mso-bidi-theme-font: minor-latin;"><o:p></o:p></span></b></div>
</td>
</tr>
</tbody></table>
</div>
<div>
<div class="MsoNormal" style="margin-left: .5in;">
<b><span style="font-size: large;"><br /></span></b></div>
<div class="MsoNormal" style="margin-left: .5in;">
<b><span style="font-size: large;">JZMQ Installation</span>: </b>JZMQ is the Java binding for ZeroMQ. Here are the steps:</div>
<div class="MsoNormal" style="margin-left: .5in;">
<br /></div>
<table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="border-collapse: collapse; border: none; margin-left: .5in; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-yfti-tbllook: 1184;">
<tbody>
<tr>
<td style="background: #F2F2F2; border: solid windowtext 1.0pt; mso-background-themecolor: background1; mso-background-themeshade: 242; mso-border-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 6.65in;" valign="top" width="638"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>git clone <a href="https://github.com/nathanmarz/jzmq.git">https://github.com/nathanmarz/jzmq.git</a><o:p></o:p></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>cd jzmq<o:p></o:p></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>./autogen.sh<o:p></o:p></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>./configure<o:p></o:p></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>make<o:p></o:p></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b>sudo make install</b></div>
</td>
</tr>
</tbody></table>
</div>
<div>
<br /></div>
<div>
<ul style="text-align: left;">
<li><span style="font-family: "Calibri","sans-serif"; font-size: 11.0pt; line-height: 115%; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-language: EN-US; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin;">Download
a Storm release from the <a href="https://github.com/nathanmarz/storm/downloads">Storm downloads page</a>
and copy it to every machine in the cluster
(the Nimbus machine and the worker machines).</span></li>
<li><span style="font-family: Calibri, sans-serif; line-height: 115%;"><b><span style="font-size: large;"><span style="background-color: white; line-height: 115%;">Configure storm.yaml</span><span style="line-height: 115%;"> </span></span><span style="font-size: 11pt; line-height: 115%;">: </span></b></span><span style="font-family: "Calibri","sans-serif"; font-size: 11.0pt; line-height: 115%; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-language: EN-US; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin;">Storm
release contains a file at conf/storm.yaml that holds the configuration for the
Storm daemons. A minimal cluster configuration looks like this:</span></li>
</ul>
<div>
<table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="background: #F2F2F2; border-collapse: collapse; border: none; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-yfti-tbllook: 1184;">
<tbody>
<tr>
<td style="border: solid windowtext 1.0pt; mso-border-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 6.15in;" valign="top" width="590"><div class="MsoListParagraphCxSpFirst" style="margin-bottom: 0.0001pt;">
<b>storm.zookeeper.servers:<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b> - "IP_MACHINE_1"<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b> - "IP_MACHINE_2"<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b>storm.local.dir: "/home/hadoop/STORM_DIR"<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b>java.library.path: "/usr/local/lib"<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b>nimbus.host: "IP_OF_NIMBUS_MACHINE"<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b>supervisor.slots.ports:<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b> - 6700<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b> - 6701<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b> - 6702<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b> - 6703<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpLast" style="margin-bottom: 0.0001pt;">
<br /></div>
</td>
</tr>
</tbody></table>
</div>
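<div class="MsoNormal">
Since the same file goes to every machine, it can be convenient to generate it from a few parameters. The Python sketch below renders the keys used in the example above; the function name <b>render_storm_yaml</b> and the IP addresses are placeholders made up for illustration. Each entry under supervisor.slots.ports gives one worker slot on a supervisor machine, so four ports allow up to four workers per machine.</div>

```python
def render_storm_yaml(zookeeper_servers, nimbus_host, local_dir, slot_ports,
                      library_path="/usr/local/lib"):
    """Return storm.yaml text with the same keys as the example above."""
    lines = ["storm.zookeeper.servers:"]
    lines += [f'    - "{server}"' for server in zookeeper_servers]
    lines.append(f'storm.local.dir: "{local_dir}"')
    lines.append(f'java.library.path: "{library_path}"')
    lines.append(f'nimbus.host: "{nimbus_host}"')
    lines.append("supervisor.slots.ports:")
    lines += [f"    - {port}" for port in slot_ports]
    return "\n".join(lines) + "\n"


# Placeholder addresses; substitute the real machines in your cluster.
text = render_storm_yaml(
    zookeeper_servers=["10.0.0.1", "10.0.0.2"],
    nimbus_host="10.0.0.1",
    local_dir="/home/hadoop/STORM_DIR",
    slot_ports=range(6700, 6704),
)
print(text)
```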
</div>
<div>
<span style="font-family: "Calibri","sans-serif"; font-size: 11.0pt; line-height: 115%; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-language: EN-US; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin;"> Copy
this storm.yaml to the “~/.storm/” directory.</span></div>
<div>
<ul style="text-align: left;">
<li><span style="font-family: Calibri, sans-serif;"><span style="font-size: 15px; line-height: 17px;"><span style="font-family: "Calibri","sans-serif"; font-size: 11.0pt; line-height: 115%; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-language: EN-US; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin;">Now
launch the Storm daemons: nimbus on the master machine, supervisor on each worker machine, and ui for the web interface.</span></span></span></li>
</ul>
<div>
<table border="1" cellpadding="0" cellspacing="0" class="MsoTableGrid" style="border-collapse: collapse; border: none; mso-border-alt: solid windowtext .5pt; mso-padding-alt: 0in 5.4pt 0in 5.4pt; mso-yfti-tbllook: 1184;">
<tbody>
<tr>
<td style="background: #F2F2F2; border: solid windowtext 1.0pt; mso-background-themecolor: background1; mso-background-themeshade: 242; mso-border-alt: solid windowtext .5pt; padding: 0in 5.4pt 0in 5.4pt; width: 6.65in;" valign="top" width="638"><div class="MsoListParagraphCxSpFirst" style="margin-bottom: 0.0001pt;">
<b>hduser@ubuntu:/usr/local/storm-0.8.1$ bin/storm nimbus &<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-bottom: 0.0001pt;">
<b>hduser@ubuntu:/usr/local/storm-0.8.1$ bin/storm supervisor &<o:p></o:p></b></div>
<div class="MsoListParagraphCxSpLast" style="margin-bottom: 0.0001pt;">
<b>hduser@ubuntu:/usr/local/storm-0.8.1$ bin/storm ui &</b></div>
</td>
</tr>
</tbody></table>
</div>
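<div class="MsoNormal">
A quick way to see whether the daemons came up is to check that their ports accept connections. The sketch below assumes Storm's default ports (6627 for the Nimbus Thrift service and 8080 for the UI) and uses "localhost" as a placeholder for the Nimbus machine. Adjust both if your configuration differs.</div>

```python
import socket


def is_listening(host, port, timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


# Assumed defaults: 6627 for Nimbus (Thrift) and 8080 for the UI.
for name, port in [("nimbus", 6627), ("ui", 8080)]:
    state = "up" if is_listening("localhost", port) else "not reachable"
    print(f"{name} on port {port}: {state}")
```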
</div>
<div>
<div class="MsoNormal" style="margin-left: .25in; text-indent: .25in;">
<span style="text-indent: 0.25in;"><br /></span></div>
<div class="MsoNormal" style="margin-left: .25in; text-indent: .25in;">
<span style="text-indent: 0.25in;">Here is an
example project to get started with Storm topologies: </span></div>
<span style="font-family: "Calibri","sans-serif"; font-size: 11.0pt; line-height: 115%; mso-ansi-language: EN-US; mso-ascii-theme-font: minor-latin; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: AR-SA; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: Calibri; mso-fareast-language: EN-US; mso-fareast-theme-font: minor-latin; mso-hansi-theme-font: minor-latin;"> <a href="https://github.com/nathanmarz/storm-starter">https://github.com/nathanmarz/storm-starter</a></span></div>
<br />
</div>
Posted by Nishu Tayal<br />
<br />
<b><span style="font-size: x-large;">Classification and Regression Trees (CART)</span></b><br />
Published 2013-09-21<br />
<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal">
<div class="MsoNormal">
</div>
<div class="MsoNormal" style="text-align: left;">
<div class="MsoNormal">
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">Classification and
Regression Trees (CART) is a decision-tree method, technically known as binary
recursive partitioning. It uses historical data to construct decision trees, which
are then used to classify new data.</span><br />
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;"><br /></span></div>
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">So the question is: <b>where should we use CART?</b></span><br />
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;"><b><br /></b></span></div>
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">Sometimes we have
problems where we want a
“Yes/No” answer,</span></div>
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12pt; line-height: 115%;"><b>e.g.
“Is the salary greater than 30000?” or “Is it going to rain today?”</b></span><br />
<b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;"><br /></span></b></div>
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">CART asks “Yes/No”
questions. The CART algorithm searches all possible variables and all possible values
to find the best split, that is, the question that divides the data into
two parts with maximum homogeneity.</span></div>
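<div class="MsoNormal" style="margin-left: .5in;">
The search for the best split can be sketched in a few lines of Python. Gini impurity is assumed here as the homogeneity measure (the text above does not name one), and the data and function names are made up for illustration.</div>

```python
def gini(labels):
    """Gini impurity of a list of class labels (0.0 means perfectly homogeneous)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(rows, labels):
    """Try every feature and every observed value; return the lowest-impurity question."""
    best = None  # (weighted impurity, feature index, threshold)
    n = len(rows)
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            yes = [y for row, y in zip(rows, labels) if row[feature] > threshold]
            no = [y for row, y in zip(rows, labels) if row[feature] <= threshold]
            if not yes or not no:
                continue  # a split must send rows to both sides
            impurity = (len(yes) * gini(yes) + len(no) * gini(no)) / n
            if best is None or impurity < best[0]:
                best = (impurity, feature, threshold)
    return best


# Made-up rows: (salary, age); the label says whether the person bought the product.
rows = [(20000, 25), (25000, 40), (32000, 30), (41000, 35), (52000, 28)]
labels = ["no", "no", "yes", "yes", "yes"]
impurity, feature, threshold = best_split(rows, labels)
print(f"Is feature {feature} > {threshold}? (weighted Gini {impurity:.3f})")
```

<div class="MsoNormal" style="margin-left: .5in;">
On this toy data the chosen question is "Is salary greater than 25000?", which separates the two classes completely.</div>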
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">Key elements for CART
analysis are :<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpFirst" style="margin-left: 1.0in; mso-add-space: auto; mso-list: l1 level1 lfo1; text-indent: -.25in;">
<!--[if !supportLists]--><span style="font-family: Symbol; font-size: 12.0pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;">·<span style="font-family: 'Times New Roman'; font-size: 7pt; line-height: normal;">
</span></span><!--[endif]--><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">Split
each node in a tree.<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpMiddle" style="margin-left: 1.0in; mso-add-space: auto; mso-list: l1 level1 lfo1; text-indent: -.25in;">
<!--[if !supportLists]--><span style="font-family: Symbol; font-size: 12.0pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;">·<span style="font-family: 'Times New Roman'; font-size: 7pt; line-height: normal;">
</span></span><!--[endif]--><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">Decide whether tree is complete or not.<o:p></o:p></span></div>
<div class="MsoListParagraphCxSpLast" style="margin-left: 1.0in; mso-add-space: auto; mso-list: l1 level1 lfo1; text-indent: -.25in;">
<!--[if !supportLists]--><span style="font-family: Symbol; font-size: 12.0pt; line-height: 115%; mso-bidi-font-family: Symbol; mso-fareast-font-family: Symbol;">·<span style="font-family: 'Times New Roman'; font-size: 7pt; line-height: normal;">
</span></span><!--[endif]--><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">Assign each leaf node to a class outcome<o:p></o:p></span></div>
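<div class="MsoNormal" style="margin-left: .5in;">
The three elements above fit together as a recursive procedure. The following Python sketch is a simplified illustration and not the rpart algorithm: it assumes Gini impurity for splitting, uses a hypothetical max_depth limit as the only stopping rule besides purity, and assigns each leaf its majority class.</div>

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a non-empty list of class labels."""
    n = len(labels)
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def grow(rows, labels, depth=0, max_depth=3):
    """Recursively partition the rows; returns a dict (inner node) or a label (leaf)."""
    # Element 2: the tree is complete here if the node is pure or deep enough.
    if len(set(labels)) == 1 or depth == max_depth:
        # Element 3: the leaf takes the majority class of its rows.
        return Counter(labels).most_common(1)[0][0]
    # Element 1: pick the yes/no question with the lowest weighted impurity.
    best = None
    for feature in range(len(rows[0])):
        for threshold in sorted({row[feature] for row in rows}):
            right = [i for i, row in enumerate(rows) if row[feature] > threshold]
            left = [i for i, row in enumerate(rows) if row[feature] <= threshold]
            if not left or not right:
                continue
            score = sum(len(side) * gini([labels[i] for i in side])
                        for side in (left, right))
            if best is None or score < best[0]:
                best = (score, feature, threshold, left, right)
    if best is None:  # no question separates the rows any further
        return Counter(labels).most_common(1)[0][0]
    _, feature, threshold, left, right = best
    return {
        "question": (feature, threshold),
        "no": grow([rows[i] for i in left], [labels[i] for i in left],
                   depth + 1, max_depth),
        "yes": grow([rows[i] for i in right], [labels[i] for i in right],
                    depth + 1, max_depth),
    }


def classify(tree, row):
    """Follow the yes/no questions from the root until a leaf label is reached."""
    while isinstance(tree, dict):
        feature, threshold = tree["question"]
        tree = tree["yes"] if row[feature] > threshold else tree["no"]
    return tree


# Made-up rows: (salary, age); the label says whether the person bought the product.
rows = [(20000, 25), (25000, 40), (32000, 30), (41000, 35), (52000, 28)]
labels = ["no", "no", "yes", "yes", "yes"]
tree = grow(rows, labels)
print(tree)
print(classify(tree, (45000, 33)))
```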
<div class="MsoNormal" style="margin-left: .5in;">
<div class="separator" style="clear: both; text-align: center;">
</div>
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">CART returns a decision
tree like the one below.</span></div>
<div class="MsoNormal" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEjUQLlft48avDeFq_eiGN4k0RR98BcLA-_JVCm6XOEpMfbZK86Epov1kN4P5NyfnFBo7_N2RxsP_bU-dVkNIhULFYAtrvjj1lNHGtDljYY0gHgLhlDw9R8onz2oUkYl-RZk48NZDOpitE/s1600/classtree.PNG" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" height="348" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgEjUQLlft48avDeFq_eiGN4k0RR98BcLA-_JVCm6XOEpMfbZK86Epov1kN4P5NyfnFBo7_N2RxsP_bU-dVkNIhULFYAtrvjj1lNHGtDljYY0gHgLhlDw9R8onz2oUkYl-RZk48NZDOpitE/s400/classtree.PNG" width="400" /></a></div>
<b><u><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">CART Modeling via rpart<o:p></o:p></span></u></b></div>
<div class="MsoNormal" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-bidi-font-weight: bold; mso-fareast-font-family: "Times New Roman";">Classification
and regression trees can be built with the </span><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;"><a href="http://cran.r-project.org/web/packages/rpart/index.html">rpart</a></span><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-bidi-font-weight: bold; mso-fareast-font-family: "Times New Roman";"> package in R.</span></div>
<div class="MsoNormal" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-bidi-font-weight: bold; mso-fareast-font-family: "Times New Roman";">The
steps to build a CART model are:<o:p></o:p></span></div>
<div class="MsoListParagraph" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 1in; text-indent: -0.25in;">
<!--[if !supportLists]--><span style="font-family: Symbol; font-size: 12.0pt; mso-bidi-font-family: Symbol; mso-bidi-font-weight: bold; mso-fareast-font-family: Symbol;">·<span style="font-family: 'Times New Roman'; font-size: 7pt;">
</span></span><!--[endif]--><b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Grow a
tree</span></b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-bidi-font-weight: bold; mso-fareast-font-family: "Times New Roman";">: Call rpart() with a model formula and a data frame. Its full signature is:</span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">rpart(formula, data,
weights, subset, na.action = na.rpart, method,<o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> model = FALSE, x = FALSE, y = TRUE,
parms, control, cost, ...)<o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<br /></div>
<div class="MsoListParagraph" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 1in; text-indent: -0.25in;">
<span style="font-family: Symbol; font-size: 12.0pt; mso-bidi-font-family: Symbol; mso-bidi-font-weight: bold; mso-fareast-font-family: Symbol;">·<span style="font-family: 'Times New Roman'; font-size: 7pt;">
</span></span><!--[endif]--><b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Examine
the results</span></b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-bidi-font-weight: bold; mso-fareast-font-family: "Times New Roman";"> - The following functions help inspect and evaluate the
fitted model:</span></div>
<table border="0" cellpadding="0" class="MsoNormalTable" style="background: #EEECE1; margin-left: 66.0pt; mso-background-themecolor: background2; mso-cellspacing: 1.5pt; mso-padding-alt: 0in 0in 0in 0in; mso-yfti-tbllook: 1184; width: 85%px;">
<tbody>
<tr>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">printcp(</span></b><i><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">fit</span></i><b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">)</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"><o:p></o:p></span></div>
</td>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">display cp table <o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">plotcp(</span></b><i><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">fit</span></i><b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">)</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"><o:p></o:p></span></div>
</td>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">plot
cross-validation results<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">rsq.rpart(</span></b><i><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">fit</span></i><b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">)</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"><o:p></o:p></span></div>
</td>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">plot
approximate R-squared and relative error for different splits (2 plots).
Labels are only appropriate for the "anova" method.<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">print(</span></b><i><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">fit</span></i><b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">)</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"><o:p></o:p></span></div>
</td>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">print
results<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">summary(</span></b><i><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">fit</span></i><b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">)</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"><o:p></o:p></span></div>
</td>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">detailed
results including surrogate splits<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">plot(</span></b><i><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">fit</span></i><b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">)</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"><o:p></o:p></span></div>
</td>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">plot
decision tree<o:p></o:p></span></div>
</td>
</tr>
<tr>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">text(</span></b><i><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">fit</span></i><b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">)</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"><o:p></o:p></span></div>
</td>
<td style="padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">label
the decision tree plot<o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 19.5pt; mso-yfti-irow: 7; mso-yfti-lastrow: yes;">
<td style="height: 19.5pt; padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">post(</span></b><i><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">fit</span></i><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">,<b>file=)</b><o:p></o:p></span></div>
</td>
<td style="height: 19.5pt; padding: 2.25pt 2.25pt 2.25pt 2.25pt;" valign="top"><div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">create
postscript plot of decision tree<o:p></o:p></span></div>
</td>
</tr>
</tbody></table>
<div class="MsoNormal" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-bidi-font-weight: bold; mso-fareast-font-family: "Times New Roman";"> Here fit is the model object returned by
the rpart() call.<o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<br /></div>
<div class="MsoListParagraph" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 1in; text-indent: -0.25in;">
<!--[if !supportLists]--><span style="font-family: Symbol; font-size: 12.0pt; mso-bidi-font-family: Symbol; mso-bidi-font-weight: bold; mso-fareast-font-family: Symbol;">·<span style="font-family: 'Times New Roman'; font-size: 7pt;">
</span></span><!--[endif]--><b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Prune
the tree </span></b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-bidi-font-weight: bold; mso-fareast-font-family: "Times New Roman";">- Pruning helps
avoid overfitting the training data. Typically, you will want to select the tree
size that minimizes the cross-validated error, shown in the </span><b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">xerror</span></b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";"> column printed
by <b>printcp()</b>.<o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 1in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-bidi-font-weight: bold; mso-fareast-font-family: "Times New Roman";">Prune the
tree to the desired size by passing the chosen complexity parameter (cp): <o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 1in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-bidi-font-weight: bold; mso-fareast-font-family: "Times New Roman";">prune(fit,cp=)<o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in; text-indent: 0.5in;">
<br /></div>
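<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">The three steps above can be sketched end to end without downloading anything. This is a minimal sketch, not the only way to do it: it assumes the kyphosis data frame that ships with rpart, and the seed, the control settings (cp = 0, minsplit = 5) and the helper n_splits() are illustrative choices rather than recommendations.</span></div>

```r
library(rpart)  # the kyphosis data frame ships with rpart

set.seed(42)  # cross-validation folds are random; fix the seed for repeatability

# Grow a deliberately large tree: cp = 0 switches off the complexity penalty.
fit <- rpart(Kyphosis ~ Age + Number + Start, data = kyphosis,
             method = "class",
             control = rpart.control(cp = 0, minsplit = 5))

cp_table <- fit$cptable                       # the table that printcp(fit) displays
best_row <- which.min(cp_table[, "xerror"])   # row with the lowest cross-validated error
best_cp  <- cp_table[best_row, "CP"]

pfit <- prune(fit, cp = best_cp)

# Count internal (non-leaf) nodes, i.e. the number of splits in a tree.
n_splits <- function(tree) sum(tree$frame$var != "<leaf>")
print(c(full = n_splits(fit), pruned = n_splits(pfit)))
```

<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">Growing an oversized tree first and then cutting it back lets the cross-validated error, rather than an arbitrary stopping rule, decide the final size.</span></div>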
<div class="MsoNormal" style="background-color: white; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in; text-indent: 0.5in;">
<span style="font-size: 12pt;">Here is an example of a
classification tree:</span><strong><span style="background-position: initial initial; background-repeat: initial initial; font-size: 12pt; font-weight: normal;"><o:p></o:p></span></strong></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;">library(rpart)<o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;">dataset <-
read.table("C:\\Users\\Nishu\\Downloads\\bank\\bank.csv",header=T,sep=";")<o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;"># grow tree <o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;">fit <- rpart(y ~ .,
method="class", data=dataset )<o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;">printcp(fit) # display the results <o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;">plotcp(fit) # visualize cross-validation
results <o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;">summary(fit) # detailed summary of
splits<o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;"># plot tree <o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;">plot(fit,
uniform=TRUE,main="Classification Tree for Bank")<o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;">text(fit, use.n=TRUE, all=TRUE, cex=.8)<o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;"># create attractive postscript plot of
tree <o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;">post(fit, file = "c:/tree.ps",
title = "Classification Tree")<o:p></o:p></span></div>
<div class="MsoPlainText" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt;"># prune the tree<br />
pfit<- prune(fit, cp=
fit$cptable[which.min(fit$cptable[,"xerror"]),"CP"])<br />
# plot the pruned tree<br />
plot(pfit, uniform=TRUE,<br />
main="Pruned Classification Tree")<br />
text(pfit, use.n=TRUE, all=TRUE, cex=.8)<br />
post(pfit, file = "c:/ptree.ps",<br />
  title = "Pruned Classification Tree")<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;"><br /></span>
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">Now we have the
model. The next step is to make predictions with the trained model.<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">First we’ll split the
dataset into training and test sets using a fixed percentage.</span><br />
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<span style="background-position: initial initial; background-repeat: initial initial; font-family: 'Times New Roman', serif; font-size: 12pt;">library(rpart)<o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<span style="background-position: initial initial; background-repeat: initial initial; font-family: 'Times New Roman', serif; font-size: 12pt;">dataset<-read.table("C:\\Users\\Nishu\\Downloads\\bank\\bank.csv",header=T,sep=";")</span><span style="font-family: 'Times New Roman', serif; font-size: 13.5pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">sub <- sample(nrow(dataset), floor(nrow(dataset) * 0.9)) # use 90% of the rows as training data</span><span style="font-family: 'Times New Roman', serif; font-size: 13.5pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">training <- dataset[sub, ]</span><span style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">testing <- dataset[-sub, ]</span><span style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"># fit on the training rows only, so the test rows stay unseen<br />
fit <- rpart(y ~ ., method="class", data=training)</span><span style="font-family: 'Times New Roman', serif; font-size: 13.5pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">predict(fit, testing, type="class")</span><span style="font-family: 'Times New Roman', serif; font-size: 13.5pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"># to get the confusion matrix</span><span style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">out <-
table(predict(fit,testing,type="class"),dataset[-sub,"y"])</span><span style="font-family: 'Courier New'; font-size: 10pt;"><o:p></o:p></span></div>
<br />
<div class="MsoNormal">
<br /></div>
<span style="font-family: 'Times New Roman', serif; font-size: 12pt; line-height: 115%;">Here is the resulting confusion matrix
(rows are predictions, columns are actual values):</span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> <b> no yes</b><o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> <b>no </b>391 25<o:p></o:p></span></div>
<div class="MsoNormal" style="background-color: #f2f2f2; background-position: initial initial; background-repeat: initial initial; margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> <b>yes </b>13 24<o:p></o:p></span></div>
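<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">Because sample() is random, every run produces a different split and a slightly different matrix. The sketch below shows one way to make the split repeatable with set.seed() while fitting only on the training rows. It is an illustration under assumptions: it uses the kyphosis data frame bundled with rpart instead of bank.csv, and the seed value is arbitrary.</span></div>

```r
library(rpart)  # the kyphosis data frame ships with rpart

set.seed(123)  # makes sample() return the same rows on every run

n   <- nrow(kyphosis)
sub <- sample(n, floor(n * 0.9))   # row indices for the 90% training share

training <- kyphosis[sub, ]
testing  <- kyphosis[-sub, ]

# Fit on the training rows only, so the test rows stay unseen.
fit <- rpart(Kyphosis ~ ., data = training, method = "class")

# Rows of the table are predictions, columns are the actual classes.
pred <- predict(fit, testing, type = "class")
out  <- table(predicted = pred, actual = testing$Kyphosis)
print(out)
```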
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;"><br /></span>
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;"><b>To get the accuracy
and other statistics</b>, use the confusionMatrix() function from the <a href="http://cran.r-project.org/web/packages/caret/caret.pdf">caret</a> package:<o:p></o:p></span><br />
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;"><br /></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">library(caret)<o:p></o:p></span></div>
<div class="MsoNormal" style="background: #F2F2F2; margin-left: .5in; mso-background-themecolor: background1; mso-background-themeshade: 242;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">confusionMatrix(out)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;"><br /></span>
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">The output looks like the following. Exact figures vary from run to run because the split is random; note that the summary statistics shown here do not match the 453-case matrix exactly, which on its own gives an accuracy of 415/453 ≈ 0.916.<o:p></o:p></span><br />
<br />
<div class="MsoPlainText" style="background-color: #f2f2f2; margin-left: 0.5in;">
<div class="MsoNormal" style="margin-left: 0.5in; text-indent: 0.5in;">
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt; line-height: 18px;">no yes<o:p></o:p></span></b></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> <b>no</b> 391 25<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> <b>yes</b> 13 24<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<br /></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> Accuracy : 0.9101<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> 95% CI : (0.8936, 0.9248)<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> No Information Rate : 0.8968<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> P-Value [Acc > NIR] : 0.05704<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> Kappa : 0.4005</span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">Mcnemar's Test P-Value : 9.213e-08<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<br /></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> Sensitivity : 0.9745<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<a href="http://www.blogger.com/blogger.g?blogID=7459259550976670934" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"></a><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> Specificity : 0.3500<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> Pos Pred Value : 0.9287<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> Neg Pred Value : 0.6125<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> Prevalence : 0.8968<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> Detection Rate : 0.8740<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> Detection Prevalence : 0.9410<o:p></o:p></span></div>
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<br /></div>
<br />
<div class="MsoNormal" style="margin: 0in 0in 0.0001pt 0.5in;">
<a href="http://www.blogger.com/blogger.g?blogID=7459259550976670934" imageanchor="1" style="clear: left; float: left; margin-bottom: 1em; margin-right: 1em;"></a><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> 'Positive' Class : no</span></div>
</div>
</div>
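<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">As a rough cross-check, the headline statistics can be recomputed in base R from any 2x2 table. The sketch below hard-codes the confusion matrix printed earlier and assumes, as table(predicted, actual) and caret both do, that rows are predictions and columns are actual values, with "no" as the positive class. The variable names are illustrative.</span></div>

```r
# Confusion matrix as printed above: rows = predicted, columns = actual.
# matrix() fills column by column, so the first two values are the "no" column.
out <- matrix(c(391, 13, 25, 24), nrow = 2,
              dimnames = list(predicted = c("no", "yes"),
                              actual    = c("no", "yes")))

positive <- "no"
negative <- "yes"

accuracy    <- sum(diag(out)) / sum(out)                         # (391 + 24) / 453
sensitivity <- out[positive, positive] / sum(out[, positive])    # 391 / (391 + 13)
specificity <- out[negative, negative] / sum(out[, negative])    # 24 / (25 + 24)
prevalence  <- sum(out[, positive]) / sum(out)                   # (391 + 13) / 453

print(round(c(accuracy = accuracy, sensitivity = sensitivity,
              specificity = specificity, prevalence = prevalence), 4))
```

<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">With classes this unbalanced, accuracy alone says little; comparing it with the prevalence of the majority class (the "No Information Rate" in the caret output) and checking specificity gives a fairer picture.</span></div>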
<div class="MsoNormal" style="margin-left: .5in;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">So here we have the
desired output: the predictions and the accuracy of the model, which we can now use to predict
future data. </span><br />
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;"><br /></span>
<b><span style="font-family: 'Times New Roman', serif; font-size: 12pt; line-height: 115%;">Download </span><a href="http://archive.ics.uci.edu/ml/datasets/Bank+Marketing#" style="font-family: 'Times New Roman', serif; font-size: 12pt; line-height: 115%;">this dataset</a><span style="font-family: 'Times New Roman', serif; font-size: 12pt; line-height: 115%;">, or another dataset from <a href="http://archive.ics.uci.edu/ml/datasets.html">here</a>, and test the algorithm.</span></b><br />
<span style="font-family: 'Times New Roman', serif; font-size: 12pt; line-height: 115%;"><br /></span>
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; line-height: 115%;">Here you go..!!!!<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-left: .5in;">
<br /></div>
</div>
</div>
</div>
<div class="MsoNormal">
</div>
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com4tag:blogger.com,1999:blog-7459259550976670934.post-13315020597176985662013-09-14T00:25:00.001+05:302013-09-14T00:28:09.077+05:30Twitter Storm : Real-time Hadoop<div dir="ltr" style="text-align: left;" trbidi="on">
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Most of us have run Hadoop
batch jobs.<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Now the next phase of the revolution is
here: real-time data analysis.<o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">So here comes</span><span style="font-family: 'Times New Roman', serif; font-size: 13.5pt;"> </span><b><span style="font-family: 'Times New Roman', serif; font-size: 13.5pt;">Twitter Storm: a distributed, fault-tolerant,
real-time computation system <o:p></o:p></span></b></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<i><span style="background-color: #f8f8f8; background-position: initial initial; background-repeat: initial initial; font-family: Cambria, serif; font-size: 12pt;">Storm is a free and open source distributed real-time
computation system. Storm makes it easy to reliably process unbounded streams
of data, doing for real-time processing what Hadoop did for batch processing.
Storm is simple, can be used with any programming language, and is a lot of fun
to use!</span><o:p></o:p></i></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<i><span style="background-color: #f8f8f8; background-position: initial initial; background-repeat: initial initial; font-family: Cambria, serif; font-size: 12pt;">Storm has many use cases: <b>realtime analytics, online machine
learning, continuous computation, distributed RPC, ETL</b>, and more. Storm is
fast: a benchmark clocked it at <b>over a
million tuples processed per second per node</b>. It is <b>scalable, fault-tolerant</b>, guarantees your data will be processed,
and is easy to set up and operate.</span><o:p></o:p></i></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<i><span style="background-color: #f8f8f8; background-position: initial initial; background-repeat: initial initial; font-family: Cambria, serif; font-size: 12pt;">(taken from <a href="http://storm-project.net/">http://storm-project.net/</a>)<o:p></o:p></span></i></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="background-color: white;"><span style="font-size: x-large;"><b><span style="font-family: 'Times New Roman', serif;">Storm
vs. Traditional Hadoop Batch Jobs:</span></b><span style="font-family: 'Times New Roman', serif;"></span></span></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Hadoop is fundamentally a batch
processing system. Data is loaded into HDFS, processed by the nodes, and
once processing is complete the results are written back to HDFS. The open problem was
how to process data in real time, and Storm came into the picture to
solve it.</span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Storm makes it easy to process
unbounded streams of data in real time. It organizes the computation into
topologies and keeps processing data as it arrives.</span></div>
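<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">The difference between the two models can be illustrated with a small, single-process Python sketch. It does not use Hadoop or Storm at all; the function names <i>batch_count</i> and <i>streaming_count</i> are made up for this illustration. The batch version needs the whole data set before it starts, while the streaming version updates its result as each record arrives.</span></div>

```python
from collections import Counter
from typing import Iterable, Iterator, List


def batch_count(records: List[str]) -> Counter:
    """Batch style: the whole data set must exist before processing starts."""
    return Counter(word for line in records for word in line.split())


def streaming_count(records: Iterable[str]) -> Iterator[Counter]:
    """Streaming style: update running state and emit a result per record."""
    totals: Counter = Counter()
    for line in records:
        totals.update(line.split())
        yield Counter(totals)  # snapshot of the counts seen so far


# Hypothetical input; in a real system the stream would never end.
lines = ["storm hadoop", "storm storm"]

final_batch = batch_count(lines)
snapshots = list(streaming_count(iter(lines)))

print(final_batch)    # one answer, available only after all input is read
print(snapshots[0])   # an answer is already available after the first record
```

<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">Once the finite input is exhausted, the last streaming snapshot equals the batch result; the streaming version simply had a usable answer at every step along the way.</span></div>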
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 13.5pt;"><br /></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="background-color: white;"><span style="font-size: x-large;"><b><span style="font-family: 'Times New Roman', serif;">Storm
Components:</span></b><span style="font-family: 'Times New Roman', serif;"></span></span></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
</div>
<ul style="text-align: left;">
<li><b style="text-indent: -0.25in;"><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Topology</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt; text-indent: -0.25in;"> : Just as
you run "MapReduce jobs" on Hadoop, you run
"topologies" on Storm. The key difference is that a MapReduce job eventually
finishes, whereas a topology runs forever (until you kill it).</span></li>
<li><b style="text-indent: -0.25in;"><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Nimbus</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt; text-indent: -0.25in;"> : The master node runs a daemon called
"Nimbus" that is similar to Hadoop's "JobTracker". Nimbus
is responsible for distributing code around the cluster, assigning tasks to
machines, and monitoring for failures</span><span style="background-color: white; font-family: Helvetica, sans-serif; font-size: 11.5pt; text-indent: -0.25in;">.</span></li>
<li><b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Supervisor</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;"> : Each worker node runs a daemon called the
"Supervisor". The supervisor listens for work assigned to its
machine and starts and stops worker processes as necessary based on what
Nimbus has assigned to it. Each worker process executes a subset of a
topology; a running topology consists of many worker processes spread
across many machines.</span></li>
<li><b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Stream
</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">: A stream is an unbounded
sequence of tuples. Storm provides the primitives for transforming a stream into a new stream
in a distributed and reliable way, for example transforming a stream of tweets into a stream of trending
topics.</span></li>
<li><b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Spout</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">: A spout is a source of streams. It reads tuples from an
external source and emits them as a stream into the topology.</span></li>
<li><b><span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Bolt
</span></b><span style="font-family: 'Times New Roman', serif; font-size: 12pt;">: It consumes input streams,
does some processing, and possibly emits new streams. Complex stream
transformations, like computing a stream of trending topics from a stream
of tweets, require multiple steps and thus multiple bolts. Bolts can do
anything: run functions, filter tuples, do streaming aggregations, do
streaming joins, talk to databases, and more.</span></li>
</ul>
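<div class="MsoNormal">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">To make the spout and bolt roles concrete, here is a minimal Python analogy built from generators. It is not the Storm API: <i>tweet_spout</i>, <i>hashtag_bolt</i> and <i>count_bolt</i> are hypothetical names, everything runs in one process, and the input is finite, whereas a real topology runs in parallel across machines and never ends. It only shows how tuples flow from a spout through a chain of bolts.</span></div>

```python
import re
from collections import Counter
from typing import Iterable, Iterator, Tuple


def tweet_spout(source: Iterable[str]) -> Iterator[Tuple[str]]:
    """Spout: read from an external source and emit one-field tuples."""
    for text in source:
        yield (text,)


def hashtag_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str]]:
    """Bolt: consume tweet tuples and emit one tuple per hashtag found."""
    for (text,) in stream:
        for tag in re.findall(r"#(\w+)", text):
            yield (tag.lower(),)


def count_bolt(stream: Iterable[Tuple[str]]) -> Iterator[Tuple[str, int]]:
    """Bolt: keep a running count per hashtag and emit (tag, count)."""
    counts: Counter = Counter()
    for (tag,) in stream:
        counts[tag] += 1
        yield (tag, counts[tag])


# Hypothetical sample data standing in for a live tweet feed.
tweets = ["Loving #Storm", "#storm beats batch", "Try #Hadoop too"]

# Wiring the "topology": spout -> hashtag bolt -> counting bolt.
topology = count_bolt(hashtag_bolt(tweet_spout(tweets)))
results = list(topology)
print(results)
```

<div class="MsoNormal">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">Each stage only sees the tuples emitted by the stage before it, which is why a transformation such as "tweets to trending topics" is naturally split across several bolts.</span></div>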
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQ3VLa6PG7MpVwT4IERJ96QI_dQuNbgIWmKYo5Cf86haMZ9qXEkN0WZg3_i7VWdp_5392vc_K6Tm-DcFOrD9XEqVopWRt5P3TBci0ShUOUCB6gSmR6PEM8eZAf04upc6GiOehLX07jm_YG/s1600/topology.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="color: black;"><img border="0" height="246" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgQ3VLa6PG7MpVwT4IERJ96QI_dQuNbgIWmKYo5Cf86haMZ9qXEkN0WZg3_i7VWdp_5392vc_K6Tm-DcFOrD9XEqVopWRt5P3TBci0ShUOUCB6gSmR6PEM8eZAf04upc6GiOehLX07jm_YG/s320/topology.png" width="320" /></span></a></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">A ZooKeeper cluster coordinates
between Nimbus and the Supervisors. The Nimbus and Supervisor daemons are stateless and fail-fast:
all state is kept in ZooKeeper or on local disk, so a crashed daemon can simply be restarted.</span></div>
<div class="MsoNormal">
<br /></div>
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhng-MeIcqXiwRvSI6QUopVSA2OqiGzSZKItNG6cwE70f7A9-6X6yHPSgw58fRK6lR1JFCfl2O9sb8oVAIoIoF091nIXKHM0jCsvVNqi2D0iHwrHBaBJytT6lXAKaP_iLgQlFRjVlK99-dw/s1600/storm-cluster.png" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEhng-MeIcqXiwRvSI6QUopVSA2OqiGzSZKItNG6cwE70f7A9-6X6yHPSgw58fRK6lR1JFCfl2O9sb8oVAIoIoF091nIXKHM0jCsvVNqi2D0iHwrHBaBJytT6lXAKaP_iLgQlFRjVlK99-dw/s1600/storm-cluster.png" /></a></div>
<o:p></o:p><br />
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<br /></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="background-color: white;"><b><span style="font-family: 'Times New Roman', serif;"><span style="font-size: x-large;">Why should anyone use Storm:</span></span></b></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;"><br /></span></div>
<div class="MsoNormal" style="margin-bottom: 0.0001pt;">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">We now have a clear idea of Storm's role
in real-time processing. Just as Hadoop MapReduce eases batch processing,
Storm eases parallel real-time computation.</span></div>
<div class="MsoNormal">
<span style="font-family: "Times New Roman","serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">The following key points
show why Storm matters:</span></div>
<div class="MsoNormal">
</div>
<ul style="text-align: left;">
<li><span style="font-family: Symbol; font-size: 12pt; text-indent: -0.25in;"><span style="font-family: 'Times New Roman'; font-size: 7pt;"> </span></span><span style="font-family: 'Times New Roman', serif; font-size: 12pt; text-indent: -0.25in;"><b>Scalable </b>: Storm scales to massive numbers
of messages per second.</span><span style="background-color: white; font-family: Helvetica, sans-serif; font-size: 11.5pt; text-indent: -0.25in;"> </span><span style="font-family: 'Times New Roman', serif; font-size: 12pt; text-indent: -0.25in;">To
scale a topology, all you have to do is add machines and increase the
parallelism settings of the topology. As an example, one of Storm's initial
applications processed 1,000,000 messages per second on a 10 node cluster,
including hundreds of database calls per second as part of the topology.
Storm's usage of Zookeeper for cluster coordination makes it scale to much
larger cluster sizes.</span></li>
<li><span style="font-family: 'Times New Roman', serif; font-size: 12pt; text-indent: -0.25in;"><b>Guarantees no loss of data</b> : Storm
guarantees that every message will be processed at least once, replaying messages whose processing fails or times out.</span></li>
<li><span style="font-family: Symbol; font-size: 12pt; text-indent: -0.25in;"><span style="font-family: 'Times New Roman'; font-size: 7pt;"> </span></span><span style="font-family: 'Times New Roman', serif; font-size: 12pt; text-indent: -0.25in;"><b>Robust</b>: Unlike Hadoop clusters, which can be hard to manage,
Storm clusters aim to make cluster management painless for the user.</span></li>
<li><span style="font-family: Symbol; font-size: 12pt; text-indent: -0.25in;"><span style="font-family: 'Times New Roman'; font-size: 7pt;"> </span></span><span style="font-family: 'Times New Roman', serif; font-size: 12pt; text-indent: -0.25in;"><b>Fault-tolerant </b>: If a fault occurs
during computation, Storm reassigns the affected tasks.</span></li>
<li><span style="font-family: Symbol; font-size: 12pt; text-indent: -0.25in;"><span style="font-family: 'Times New Roman'; font-size: 7pt;"><b> </b></span></span><span style="font-family: 'Times New Roman', serif; font-size: 12pt; text-indent: -0.25in;"><b>Broad set of use cases</b>: Storm can be
used for stream processing, updating databases, running continuous queries on data streams
and streaming the results to clients (continuous computation), and distributed RPC.</span></li>
</ul>
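<div class="MsoNormal">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">The "no loss of data" point rests on acknowledging messages and replaying the ones that fail. The following Python sketch shows that general idea only; it is not Storm's actual acking implementation. The names <i>run_with_replay</i> and <i>flaky</i>, and the <i>max_attempts</i> cut-off that keeps the example from looping forever, are assumptions made for this illustration.</span></div>

```python
from collections import deque
from typing import Callable, Dict, Iterable, List, Tuple


def run_with_replay(
    messages: Iterable[str],
    process: Callable[[str], None],
    max_attempts: int = 3,
) -> Tuple[List[str], List[str]]:
    """Deliver each message until `process` succeeds or attempts run out."""
    pending = deque((msg, 1) for msg in messages)
    acked: List[str] = []
    gave_up: List[str] = []
    while pending:
        msg, attempt = pending.popleft()
        try:
            process(msg)
            acked.append(msg)  # success counts as an acknowledgement
        except Exception:
            if attempt < max_attempts:
                pending.append((msg, attempt + 1))  # queue it for replay
            else:
                gave_up.append(msg)
    return acked, gave_up


# A hypothetical processing step that fails the first time it sees "b".
seen: Dict[str, int] = {}


def flaky(msg: str) -> None:
    seen[msg] = seen.get(msg, 0) + 1
    if msg == "b" and seen[msg] == 1:
        raise RuntimeError("transient failure")


acked, gave_up = run_with_replay(["a", "b", "c"], flaky)
print(acked, gave_up, seen)
```

<div class="MsoNormal">
<span style="font-family: 'Times New Roman', serif; font-size: 12pt;">Note that message "b" is processed twice and is acknowledged after "c". Replay gives at-least-once delivery, so processing steps should tolerate duplicates and reordering.</span></div>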
</div>
Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com0tag:blogger.com,1999:blog-7459259550976670934.post-81371046550036850962012-04-12T00:00:00.001+05:302012-04-12T11:32:13.874+05:30How to get rid of "Commands out of sync, you can't run the command now" error in mysql php<div dir="ltr" style="text-align: left;" trbidi="on">
<br />
<div style="background-attachment: initial; background-clip: initial; background-color: white; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; clear: both; font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; font-size: 14px; line-height: 18px; margin-bottom: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline; word-wrap: break-word;">
I wrote a stored procedure for a table and then ran queries against the same table in a single PHP function, but I got this error:</div>
<pre class="default prettyprint" style="background-attachment: initial; background-clip: initial; background-color: #eeeeee; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; font-size: 14px; line-height: 18px; margin-bottom: 10px; max-height: 600px; overflow-x: auto; overflow-y: auto; padding-bottom: 5px; padding-left: 5px; padding-right: 5px; padding-top: 5px; vertical-align: baseline; width: auto;"><code style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="typ" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: #2b91af; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">Error</span><span class="pln" style="background-attachment: initial; background-clip: initial; 
background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="kwd" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: darkblue; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">in</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> db </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; 
vertical-align: baseline;">:</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="typ" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: #2b91af; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">Commands</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="kwd" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: darkblue; margin-bottom: 0px; 
margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">out</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> of sync</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">,</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> you can</span><span class="str" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; 
border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: maroon; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">'t run the command now..</span></code></pre>
<div style="background-attachment: initial; background-clip: initial; background-color: white; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; clear: both; font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; font-size: 14px; line-height: 18px; margin-bottom: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline; word-wrap: break-word;">
I also tried mysqli::multi_query instead of mysqli::query, but got null output. </div>
<div style="background-attachment: initial; background-clip: initial; background-color: white; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; clear: both; font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; font-size: 14px; line-height: 18px; margin-bottom: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline; word-wrap: break-word;">
After some effort, I found the solution.</div>
<div style="background-attachment: initial; background-clip: initial; background-color: white; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; clear: both; font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; font-size: 14px; line-height: 18px; margin-bottom: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline; word-wrap: break-word;">
<span style="color: #434343; font-family: helvetica, arial, verdana, 'ms sans serif', sans-serif; line-height: 22px;">"Commands out of sync" is caused by unused result sets left over by the stored procedure. When you call the procedure, its result sets stay pending on the connection until you consume them. </span></div>
<div style="background-attachment: initial; background-clip: initial; background-color: white; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; clear: both; font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; font-size: 14px; line-height: 18px; margin-bottom: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline; word-wrap: break-word;">
<span style="color: #434343; font-family: helvetica, arial, verdana, 'ms sans serif', sans-serif; line-height: 22px;">In that case, every pending result set must be fetched or freed before the same connection can run another query.</span></div>
<div style="background-attachment: initial; background-clip: initial; background-color: white; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; clear: both; font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; font-size: 14px; line-height: 18px; margin-bottom: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline; word-wrap: break-word;">
<span style="color: #434343; font-family: helvetica, arial, verdana, 'ms sans serif', sans-serif; line-height: 22px;">Use the following code to overcome this error:</span></div>
<div style="background-attachment: initial; background-clip: initial; background-color: white; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; clear: both; font-family: Arial, 'Liberation Sans', 'DejaVu Sans', sans-serif; font-size: 14px; line-height: 18px; margin-bottom: 1em; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline; word-wrap: break-word;">
</div>
<pre class="default prettyprint" style="background-attachment: initial; background-clip: initial; background-color: #eeeeee; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; margin-bottom: 10px; max-height: 600px; overflow-x: auto; overflow-y: auto; padding-bottom: 5px; padding-left: 5px; padding-right: 5px; padding-top: 5px; vertical-align: baseline; width: auto;"><code style="background-attachment: initial; background-clip: initial; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$sql</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; 
background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">=</span><span class="str" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: maroon; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">""</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">;</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
</span><span class="kwd" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: darkblue; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">if</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 
0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">mysqli_multi_query</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$link</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">,</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; 
border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> $sql</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">))</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">{</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 
0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
</span><span class="kwd" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: darkblue; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">do</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">{</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 
0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
</span><span class="kwd" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: darkblue; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">if</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 
0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$result </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">=</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> mysqli_store_result</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 
0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$link</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">))</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">{</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; 
border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
</span><span class="kwd" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: darkblue; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">while</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; 
margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$row </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">=</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> mysqli_fetch_array</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; 
border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$result</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">))</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">{</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: 
initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
array_push</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$arrows</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">,</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 
0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$row</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">);</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">}</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
mysqli_free_result</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$result</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">);</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; 
margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">}</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">}</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="kwd" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; color: darkblue; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">while</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; 
margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;"> </span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">mysqli_next_result</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">(</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; 
border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">$link</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">));</span><span class="pln" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">
</span><span class="pun" style="background-attachment: initial; background-clip: initial; background-color: transparent; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; margin-bottom: 0px; margin-left: 0px; margin-right: 0px; margin-top: 0px; padding-bottom: 0px; padding-left: 0px; padding-right: 0px; padding-top: 0px; vertical-align: baseline;">}</span></code></pre>
<pre class="default prettyprint" style="background-attachment: initial; background-clip: initial; background-color: #eeeeee; background-image: initial; background-origin: initial; border-bottom-width: 0px; border-color: initial; border-image: initial; border-left-width: 0px; border-right-width: 0px; border-style: initial; border-top-width: 0px; font-family: Consolas, Menlo, Monaco, 'Lucida Console', 'Liberation Mono', 'DejaVu Sans Mono', 'Bitstream Vera Sans Mono', 'Courier New', monospace, serif; margin-bottom: 10px; max-height: 600px; overflow-x: auto; overflow-y: auto; padding-bottom: 5px; padding-left: 5px; padding-right: 5px; padding-top: 5px; vertical-align: baseline; width: auto;">Use <code>mysqli_multi_query</code> instead of <code>mysqli_query</code>.</pre>
<br /></div>Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com2tag:blogger.com,1999:blog-7459259550976670934.post-76710667779045619562012-04-07T20:41:00.000+05:302012-04-07T21:01:03.292+05:30Building Apache Shindig for PHP<div dir="ltr" style="text-align: left;" trbidi="on">
Apache Shindig is an OpenSocial container: it hosts OpenSocial apps by rendering gadgets and providing a reference implementation of the back-end APIs.<br />
<br />
Shindig is available in two languages: Java and PHP.<br />
Here we'll discuss building Shindig for PHP.
<ul style="text-align: left;">
<li>First, download the PHP version of Shindig from <a href="http://shindig.apache.org/download/index.html">http://shindig.apache.org/download/index.html</a></li>
<li>Enable the Apache mod_rewrite module.</li>
<li style="text-align: left;"><span style="font-family: inherit;"><span style="background-color: white; text-align: -webkit-auto;">Enable the json, curl, simplexml and mcrypt extensions in PHP.</span></span></li>
<li style="text-align: left;"><span style="font-family: inherit;"><span style="background-color: white; text-align: -webkit-auto;">Create a directory named "shindig" (lowercase, to match the URL used below) in the 'www' directory (on Windows) or the /var/www/html directory (on Linux), and extract the downloaded archive into it.</span></span></li>
<li style="text-align: -webkit-auto;">Open the <span style="white-space: pre-wrap;">php/config/container.php</span> file and change the value of the 'web_prefix' setting to '/shindig/php'.</li>
<li style="text-align: -webkit-auto;">Now you can open the following URL in a browser:</li>
</ul>
<div style="text-align: -webkit-auto;">
<span style="white-space: pre-wrap;"><a href="http://localhost/shindig/php/gadgets/ifr?url=http://www.labpixies.com/campaigns/todo/todo.xml">http://localhost/shindig/php/gadgets/ifr?url=http://www.labpixies.com/campaigns/todo/todo.xml</a></span>
</div>
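<div>
If the URL above does not render the gadget, a missing PHP extension is one possible cause. The sketch below checks the four extensions named in the list above. The helper name missing_extensions is made up for this example and is not part of Shindig; extension_loaded() is a PHP built-in.</div>

```php
<?php
// Extensions the steps above ask you to enable for Shindig (PHP).
$required = array('json', 'curl', 'simplexml', 'mcrypt');

/**
 * Return the names from $extensions that are not loaded in this PHP build.
 * (Hypothetical helper written for this post, not a Shindig function.)
 */
function missing_extensions(array $extensions)
{
    $missing = array();
    foreach ($extensions as $name) {
        // extension_loaded() is a PHP built-in.
        if (!extension_loaded($name)) {
            $missing[] = $name;
        }
    }
    return $missing;
}

$missing = missing_extensions($required);
if (count($missing) === 0) {
    echo "All required extensions are loaded.\n";
} else {
    echo "Missing extensions: " . implode(', ', $missing) . "\n";
}
```

<div>
Save it inside the web root and open it through the web server rather than the command line, because the command-line PHP may read a different php.ini than Apache does.</div>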
<div style="text-align: -webkit-auto;">
<br /></div>
<div style="text-align: -webkit-auto;">
For more information, please visit <a href="http://shindig.apache.org/" style="text-align: left;">http://shindig.apache.org/</a>.</div>
</div>Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com1NOIDA, Uttar Pradesh, India28.5355161 77.391026528.423918099999998 77.233098 28.6471141 77.548954999999992tag:blogger.com,1999:blog-7459259550976670934.post-48128832548526234362011-04-16T00:53:00.003+05:302012-05-02T18:06:49.992+05:30Installing PHPUnit On Windows<div dir="ltr" style="text-align: left;" trbidi="on">
<span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;">This post covers installing PHPUnit on Windows. PHPUnit is a unit-testing framework for PHP. Let's go through the procedure:</span><br />
<ul style="text-align: left;">
<li><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-US" style="color: #333333; line-height: 115%;">Download and install </span><span lang="EN-US"><a href="http://www.wampserver.com/en/download.php" target="_blank" title="Download WAMP"><span style="color: #59708c; line-height: 115%;">WAMP</span></a></span><span lang="EN-US" style="color: #333333; line-height: 115%;">.</span></span></li>
<li><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-US" style="color: #333333; line-height: 115%;">Once WAMP is installed, open a command prompt and change to the PHP directory inside WAMP. Assuming WAMP is installed on the C: drive:</span></span></li>
</ul>
<div>
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="Apple-style-span" style="line-height: 14px;"><span class="Apple-style-span" style="color: black; line-height: normal;"><b><span lang="EN-US" style="color: #333333; line-height: 14px;"> C:\> cd wamp\bin\php\php5.2.8</span></b></span></span></span></div>
<ul style="text-align: left;">
<li><span class="Apple-style-span" style="color: #333333;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-US" style="color: #333333; line-height: 115%;">From here, run the go-pear.bat file to install </span><span lang="EN-US" style="line-height: 115%;"><a href="http://pear.php.net/" target="_blank" title="PEAR"><span style="color: #59708c; line-height: 115%;">PEAR</span></a></span><span lang="EN-US" style="color: #333333; line-height: 115%;"> and all the files it needs.</span></span></span></li>
</ul>
<div class="MsoListParagraph" style="color: #333333;">
<span class="apple-style-span"><b><span lang="EN-US" style="color: #333333; line-height: 115%;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"> C:\wamp\bin\php\php5.2.8>go-pear.bat<o:p></o:p></span></span></b></span></div>
<ul style="text-align: left;">
<li><span class="Apple-style-span" style="color: #333333;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span" style="line-height: 14px;"><span lang="EN-US" style="color: #333333; line-height: 115%;">This command asks a series of questions to set itself up correctly. If you are unsure what to answer, accept the defaults. If you use more than one version of PHP, however, choose “<b>local</b>” when asked whether to install a system-wide or a local copy.</span></span></span></span></li>
</ul>
<div class="MsoListParagraph" style="text-indent: -18pt;">
</div>
<ul style="text-align: left;">
<li><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-US" style="color: #333333; line-height: 115%;">This installs PEAR 1.7.2. The PHPUnit 3.5 package needs PEAR 1.9.1 or later, so upgrade PEAR first. </span></span><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span class="Apple-style-span" style="color: #333333; line-height: 14px;">To upgrade PEAR, use the following command:</span></span></li>
</ul>
<div>
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"></span><br />
<div class="MsoListParagraph" style="line-height: 14px;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span"><b><span lang="EN-US" style="color: #333333; line-height: 115%;"> </span></b></span><b><span lang="EN-US" style="color: #333333; line-height: 115%;"> C:\wamp\bin\php\php5.2.8>pear upgrade pear</span></b></span></div>
<div class="MsoListParagraph" style="line-height: 14px;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><b><br />
</b></span></div>
<div class="MsoListParagraph" style="line-height: 14px;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span"><span lang="EN-US" style="color: #333333; line-height: 115%;"></span></span></span></div>
<div class="MsoListParagraphCxSpFirst">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span"><span lang="EN-US" style="color: #333333; line-height: 115%;">You can check the installed PEAR version with either of these commands:</span></span></span></div>
<div class="MsoListParagraphCxSpMiddle" style="font-weight: bold;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span"><b><span lang="EN-US" style="color: #333333; line-height: 115%;"><br />
</span></b></span></span></div>
<div class="MsoListParagraphCxSpMiddle" style="font-weight: bold;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span"><b><span lang="EN-US" style="color: #333333; line-height: 115%;"> C:\wamp\bin\php\php5.2.8>pear info pear<o:p></o:p></span></b></span></span></div>
<div class="MsoListParagraphCxSpMiddle">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span"><span lang="EN-US" style="color: #333333; line-height: 115%;"> </span></span><span class="Apple-style-span" style="font-weight: bold;"><b><span lang="EN-US" style="color: #333333; line-height: 115%;"> C:\wamp\bin\php\php5.2.8>pear -V</span></b></span></span></div>
<div class="MsoListParagraphCxSpMiddle">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="Apple-style-span" style="font-weight: bold;"><b><span lang="EN-US" style="color: #333333; line-height: 115%;"><br />
</span></b></span></span></div>
<ul style="text-align: left;">
<li><div class="MsoListParagraph" style="text-indent: -18pt;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span" style="line-height: 14px;"><span lang="EN-US" style="color: #333333; line-height: 115%;">Once PEAR is installed, run PEAR_ENV.reg, which creates the environment variables for the user so that PEAR can be called from any directory on the command line.</span></span></span></div>
<div class="MsoListParagraph" style="text-indent: -18pt;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span" style="line-height: 14px;"><span lang="EN-US" style="color: #333333; line-height: 115%;"><br />
</span></span></span></div>
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span" style="font-weight: bold; line-height: 14px;"><b><span lang="EN-US" style="color: #333333; line-height: 115%;"> C:\wamp\bin\php\php5.2.8>PEAR_ENV.reg </span></b></span></span></li>
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"> </span></ul>
</div>
<div>
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif; line-height: 14px;">If you still cannot call PEAR from the command prompt, add the directory "C:\wamp\bin\php\php5.2.8" manually to the '<b>PATH</b>' and '<b>INCLUDE_PATH</b>' variables in the System Environment Variables list. Alternatively, double-click the file "<b>PEAR_ENV.reg</b>" in the PHP folder to register the environment variables in the system.</span><br />
<span style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><span style="line-height: 14px;"><br /></span></span></div>
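<div>
To confirm that the path change took effect, a small script like the one below can help. It is only a sketch: the function name is_on_path is invented for this post, and the directory is the one assumed throughout (C:\wamp\bin\php\php5.2.8). Open a new command prompt before running it, so the updated environment is picked up.</div>

```php
<?php
/**
 * Report whether $dir appears in a PATH-style string.
 * The comparison ignores case and trailing slashes, as Windows paths do.
 * (Hypothetical helper written for this post.)
 */
function is_on_path($dir, $pathValue, $separator = PATH_SEPARATOR)
{
    $wanted = strtolower(rtrim($dir, '\\/'));
    foreach (explode($separator, $pathValue) as $entry) {
        if (strtolower(rtrim(trim($entry), '\\/')) === $wanted) {
            return true;
        }
    }
    return false;
}

// Directory used throughout this post; adjust it to your own WAMP install.
$phpDir = 'C:\\wamp\\bin\\php\\php5.2.8';

echo is_on_path($phpDir, (string) getenv('PATH'))
    ? "PHP directory is on PATH\n"
    : "PHP directory is NOT on PATH\n";

// include_path as this PHP build currently sees it.
echo "include_path: " . get_include_path() . "\n";
```

<div>
Run it with "php check_path.php" (the file name is arbitrary). If the first line reports that the directory is not on PATH, revisit the environment variable step above.</div>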
<ul style="text-align: left;">
<li style="color: #333333; line-height: 14px;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-US" style="color: #333333; line-height: 14px;">Once PEAR is set up, register the </span><span lang="EN-US"><a href="http://www.phpunit.de/" target="_blank" title="PHPUnit"><span style="color: #59708c; line-height: 14px;">PHPUnit</span></a></span><span lang="EN-US" style="color: #333333; line-height: 14px;"> channel, along with the two other channels listed below, with PEAR.</span></span></li>
</ul>
<div style="color: #333333; line-height: 14px;">
</div>
<div class="MsoListParagraphCxSpMiddle" style="color: #333333; line-height: 14px; margin-bottom: 0px; margin-left: 18pt; margin-right: 0px; margin-top: 0px;">
<span class="apple-style-span"><b><span lang="EN-US" style="color: #333333; line-height: 14px;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"> C:\wamp\bin\php\php5.2.8>pear channel-discover pear.phpunit.de<o:p></o:p></span></span></b></span></div>
<div class="MsoListParagraphCxSpMiddle" style="color: #333333; line-height: 14px; margin-bottom: 0px; margin-left: 18pt; margin-right: 0px; margin-top: 0px;">
<b><span lang="EN-US" style="color: #333333; line-height: 14px;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"> C:\wamp\bin\php\php5.2.8>pear channel-discover components.ez.no</span></span></b></div>
<div class="MsoListParagraphCxSpMiddle" style="color: #333333; line-height: 14px; margin-bottom: 0px; margin-left: 18pt; margin-right: 0px; margin-top: 0px;">
<b><span lang="EN-US" style="color: #333333; line-height: 14px;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"> C:\wamp\bin\php\php5.2.8>pear channel-discover pear.symfony-project.com</span></span></b></div>
<div class="MsoListParagraphCxSpMiddle" style="color: #333333; line-height: 14px; margin-bottom: 0px; margin-left: 18pt; margin-right: 0px; margin-top: 0px;">
<b><span lang="EN-US" style="color: #333333; line-height: 14px;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><br />
</span></span></b></div>
<ul style="text-align: left;">
<li style="color: #333333; line-height: 14px;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><div class="MsoListParagraph" style="display: inline !important; text-indent: -18pt;">
<span class="apple-style-span"><span lang="EN-US" style="color: #333333; line-height: 115%;">Now you can use PEAR to install packages from the PHPUnit channel.</span></span></div>
</span></li>
</ul>
<div style="text-indent: -24px;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif;"><b><span class="apple-style-span"><b><span lang="EN-US" style="color: #333333; line-height: 18px;"> </span><span lang="EN-US" style="color: #333333; line-height: 14px;">C:\wamp\bin\php\php5.2.8>pear install --</span></b><b style="line-height: 14px;"><span lang="EN-US" style="color: #333333; line-height: 14px;">alldeps </span><span lang="EN-US" style="color: #333333; line-height: 14px;">phpunit/PHPUnit</span></b></span></b></span></div>
<div style="text-indent: -24px;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif; line-height: 14px;"><br />
</span></div>
<div style="text-indent: -24px;">
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif; line-height: 14px;"> You can check the version of PHPUnit with:</span></div>
<div>
<span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span class="Apple-style-span" style="color: #333333; line-height: 14px;"><b><span lang="EN-US" style="color: #333333; line-height: 14px;"> </span></b></span></span><span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif; line-height: 14px;"><b><span lang="EN-US" style="color: #333333; line-height: 14px;"> C:\wamp\bin\php\php5.2.8>phpunit --version</span></b></span></div>
<div>
<span class="Apple-style-span" style="color: #333333; line-height: 14px;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><br />
</span></span></div>
<div>
<span class="Apple-style-span" style="color: #333333; font-family: Arial, Helvetica, sans-serif; line-height: 14px;">This may download and install the PHPUnit 1.3.1 package, which is compatible with PHP 4. To get PHPUnit 3.5 instead, force the installation with this command:</span></div>
<div class="MsoNormal" style="margin-left: 36.0pt; tab-stops: 18.0pt 36.0pt;">
<span class="Apple-style-span" style="color: #333333; line-height: 14px;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><br />
</span></span></div>
<div class="MsoNormal" style="margin-left: 36.0pt; tab-stops: 18.0pt 36.0pt;">
<span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span class="Apple-style-span" style="color: #333333; line-height: 14px;"></span></span></div>
<div class="MsoNormal" style="margin-left: 18.0pt; tab-stops: 18.0pt 36.0pt;">
<b><span lang="EN-US" style="color: #333333; line-height: 115%;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"> C:\wamp\bin\php\php5.2.8>pear install -f phpunit/PHPUnit</span></span></b></div>
<div class="MsoNormal" style="margin-left: 18.0pt; tab-stops: 18.0pt 36.0pt;">
<b><span lang="EN-US" style="color: #333333; line-height: 115%;"><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><br />
</span></span></b></div>
<div class="MsoNormal" style="margin-left: 18.0pt; tab-stops: 18.0pt 36.0pt;">
<b><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span lang="EN-US"></span></span></b></div>
<div class="MsoNormal" style="color: #333333; line-height: 115%;">
<span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span" style="font-weight: bold;"><b><span lang="EN-US" style="color: #333333; line-height: 115%;">Note: </span></b></span><span class="apple-style-span"><span lang="EN-US" style="color: #333333; line-height: 115%;">the -f option forces installation of the package.</span></span></span></div>
<div class="MsoNormal" style="color: #333333; line-height: 115%;">
<span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span class="apple-style-span"><span lang="EN-US" style="color: #333333; line-height: 115%;"><b><br />
</b></span></span></span></div>
<ul style="text-align: left;">
<li><span class="Apple-style-span" style="font-family: Arial, Helvetica, sans-serif;"><span class="Apple-style-span" style="color: #333333; line-height: 18px;">Your test environment is now set up, and you can start writing test cases</span>. </span></li>
<li><span style="font-family: Arial, Helvetica, sans-serif;">If you want to run php or phpunit from any location, make sure that the "<b>path</b>" and "<b>include_path</b>" environment variables include the PHP directory.</span></li>
</ul>
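<div>
As a starting point, a first test case might look like the sketch below. It assumes the PHPUnit 3.x style, where test classes extend PHPUnit_Framework_TestCase. The function add_numbers and the class AddNumbersTest are made-up names for illustration only.</div>

```php
<?php
// Hypothetical function under test; not part of PHPUnit or of this post.
function add_numbers($a, $b)
{
    return $a + $b;
}

// PHPUnit 3.x test cases extend PHPUnit_Framework_TestCase. The guard only
// lets this file load without a fatal error when PHPUnit is not available.
if (class_exists('PHPUnit_Framework_TestCase')) {
    class AddNumbersTest extends PHPUnit_Framework_TestCase
    {
        // PHPUnit runs every public method whose name starts with "test".
        public function testAddsTwoIntegers()
        {
            $this->assertEquals(5, add_numbers(2, 3));
        }

        public function testAddsNegativeNumbers()
        {
            $this->assertEquals(-1, add_numbers(2, -3));
        }
    }
}
```

<div>
Saved as AddNumbersTest.php, it should be runnable with "phpunit AddNumbersTest.php" from the directory that contains the file. In a real project the function under test would normally live in its own file and be loaded with require_once.</div>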
<br /></div>Nishu Tayalhttp://www.blogger.com/profile/12557963497953617072noreply@blogger.com18