GT data mining<br />
Edith Ohri (http://www.blogger.com/profile/08106475396227818104, noreply@blogger.com)<br />
<br />
A perpetuum mobile of data – the essence of the IT revolution<br />
Published 2016-09-25<br />
<div class="MsoNormal" style="background: white; margin-bottom: .0001pt; margin: 0in;">
<span style="color: #222222; font-size: 11.0pt;"><span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: .0001pt; margin: 0in;">
<span style="color: #222222; font-size: 11.0pt;"><span style="font-family: "arial" , "helvetica" , sans-serif;">The essence of the Information
Technology revolution, the engine that propels it, is that in today's
information systems data brings about more and more data in a closed, self-amplifying
loop: the data invites applications, applications bring users, users attract new
service ideas, new services create more operations &amp; management data, and
so forth. <o:p></o:p></span></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: .0001pt; margin: 0in;">
<span style="color: #222222; font-size: 11.0pt;"><span style="font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 14.6667px;"><b>Data is the raw material of the information industry</b>. Understanding that makes one appreciate the huge opportunity of free material that this industry enjoys (remark: of course, an organization's information infrastructure does cost money, but it is regarded as a general investment or overhead). </span><span style="font-size: 11pt;"> </span><span dir="RTL" lang="HE" style="font-size: 11pt;"><o:p></o:p></span></span></span></div>
<div class="MsoNormal" style="background: white; margin-bottom: .0001pt; margin: 0in;">
<br /></div>
<div class="MsoNormal" style="background: white; margin-bottom: .0001pt; margin: 0in;">
<span style="color: #222222; font-family: "arial" , "helvetica" , sans-serif;"><span style="font-size: 14.6667px;">The problem with virtual data assets is that most of them are intangible, i.e. they do not have a specific registered value in the accounts. Hence management may miss their existence. As long as the organization's competitors are sleepy, the waste does not really hurt and usually goes unnoticed by management. But the minute somebody else in the industry starts to use information for strategic advantage, the rules of the game change, forever.</span></span><br />
<br /></div>
<div class="MsoNormal" style="background: white; margin-bottom: .0001pt; margin: 0in;">
<span style="color: #222222; font-size: 11.0pt;"><span style="font-family: "arial" , "helvetica" , sans-serif;">Take, for example, the story of the meteoric
rise of Netflix to world leadership in supplying movies over the web. Prior to the founding of Netflix
in 1997, the market was dominated by Blockbuster, which was not inclined to <span style="font-family: Arial, sans-serif; font-size: 11pt;">adopt advanced</span> technologies, in contrast to Netflix, which was quick to employ
new techniques and data from operations for its "agile" business
development. Blockbuster simply stayed behind and did not have much chance to
close the widening gap. In such a case it does not help even if a company is as
big, strong, reputable and internationally spread as Blockbuster was.</span><o:p></o:p></span><br />
<span style="color: #222222; font-size: 11.0pt;"><span style="font-family: "arial" , "helvetica" , sans-serif;"><br /></span></span>
<span style="color: #222222; font-size: 11.0pt;"><span style="font-family: "arial" , "helvetica" , sans-serif;">Edith Ohri</span></span><br />
<span style="color: #222222; font-size: 11.0pt;"><span style="font-family: "georgia" , "times new roman" , serif;"><i>Home of GT data mining </i></span></span><br />
<span style="color: #222222; font-size: 11.0pt;"><span style="font-family: "trebuchet ms" , sans-serif;"><i>Sep.2016</i></span></span></div>
<br />
Is Machine Learning chasing its own tail (of presumptions)?<br />
Edith Ohri, published 2016-09-23<br />
<div class="MsoNormal" style="background: white; direction: ltr; unicode-bidi: embed;">
<b><span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Machine Learning</span></b><span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";"> (ML) as a method of learning is indeed a
machine, i.e. it operates consistently, repeatedly and predictably, by a
designed method that is made for specific conditions; but its "learning"
part is more like "training" or "verification" than
the acquisition of new knowledge that the name suggests. Practically
speaking, ML is made to improve prescribed response formulas, not to invent
such formulas and (I know the statement might seem controversial) not
even to correct them.<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; direction: ltr; unicode-bidi: embed;">
<span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";"><br /></span></div>
<div class="MsoNormal" style="background: white; direction: ltr; unicode-bidi: embed;">
<span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Here is then my take
on the issue:<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; direction: ltr; unicode-bidi: embed;">
<span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";"><br /></span></div>
<div class="MsoNormal" style="background: white; direction: ltr; unicode-bidi: embed;">
<span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Law #1 A dog (or a
cat) chasing its tail for a long enough time will eventually catch it.<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; direction: ltr; unicode-bidi: embed;">
<span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";"><br /></span></div>
<div class="MsoNormal" style="background: white; direction: ltr; unicode-bidi: embed;">
<span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-fareast-font-family: "Times New Roman";">Law #2 The catching
will hurt!<o:p></o:p></span></div>
<span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-ansi-language: EN-US; mso-bidi-language: HE; mso-fareast-font-family: "Times New Roman"; mso-fareast-language: EN-US;"><br /></span>
<span style="color: #111111; font-family: "Arial","sans-serif"; font-size: 12.0pt; mso-ansi-language: EN-US; mso-bidi-language: HE; mso-fareast-font-family: "Times New Roman"; mso-fareast-language: EN-US;">Law #3 Getting painful results will not stop the
chase; it will stop only due to boredom or the exhaustion of all energy
resources. </span><br />
<br />
<br />
<br />
The law of Large Numbers fails in big data<br />
Edith Ohri, published 2016-09-22<br />
<div class="MsoNormal" style="margin-bottom: .0001pt; margin: 0in;">
<span style="font-family: "Courier New"; font-size: 13.5pt;">The law of Large
Numbers is often regarded as a sort of "law of nature" by which
variables' averages always gravitate to fixed clear values. </span><span style="font-family: "Times New Roman", serif; font-size: 13.5pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin: 0in;">
<span style="font-family: "Courier New"; font-size: 13.5pt;">The question is,
does the law of large numbers hold true in the case of big data?</span><span style="font-family: "Times New Roman", serif; font-size: 13.5pt;"><o:p></o:p></span></div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin: 0in;">
<span style="font-family: "Courier New"; font-size: 13.5pt;">The key to the
answer is in the law's underlying assumptions regarding sample representation
and data stability. One of the qualities that characterize big data is volatility. Volatility
thrives in the large, multivariate and closely packed interrelated events that
usually exist in big data, and the resulting dynamics interfere
with the convergence of averages and prevent it from happening. </span><span style="font-family: "Times New Roman", serif; font-size: 13.5pt;"><o:p></o:p></span></div>
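<div class="MsoNormal" style="margin-bottom: .0001pt; margin: 0in;">
<span style="font-family: 'Courier New'; font-size: 13.5pt;">The role of the stability assumption can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions: the "stable" series is made of independent draws from one fixed distribution, and the "volatile" series is a simple random walk standing in for interrelated, drifting events. Neither series comes from real data, and all function names are hypothetical.</span></div>

```python
import random


def running_means(values):
    """Return the running average after each new observation."""
    total = 0.0
    means = []
    for n, x in enumerate(values, start=1):
        total += x
        means.append(total / n)
    return means


def stable_series(n, rng):
    """Independent draws from one fixed distribution (the law's assumptions hold)."""
    return [rng.gauss(0.0, 1.0) for _ in range(n)]


def volatile_series(n, rng):
    """A random walk: every value builds on the previous one (assumed stand-in for volatility)."""
    level = 0.0
    out = []
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        out.append(level)
    return out


def late_swing(means):
    """Range covered by the running average over the second half of the series."""
    tail = means[len(means) // 2:]
    return max(tail) - min(tail)


rng = random.Random(0)  # fixed seed so the run is repeatable
n = 100_000
stable_swing = late_swing(running_means(stable_series(n, rng)))
volatile_swing = late_swing(running_means(volatile_series(n, rng)))
print(f"stable series, late swing of the average:   {stable_swing:.4f}")
print(f"volatile series, late swing of the average: {volatile_swing:.4f}")
```

<div class="MsoNormal" style="margin-bottom: .0001pt; margin: 0in;">
<span style="font-family: 'Courier New'; font-size: 13.5pt;">In the stable series the average barely moves once many records have accumulated, whereas in the volatile series it keeps wandering no matter how many records are added.</span></div>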
<div class="MsoPlainText">
</div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin: 0in;">
<span style="font-family: "Courier New"; font-size: 13.5pt;">In my view, even
if the law of large numbers were true for big data, it would not be of
much use, due to its focus on common "average" behavior that is
already known, rather than on irregularities and exceptions that are as yet
unknown and require research, such as in the study of early-detection
indicators, adverse effects, fraud detection, quality assurance, customer retention,
accidents, and long-tail marketing, to mention a few. Long tails, for example, consist
of overlooked hidden phenomena, thus their discovery has to look, by definition,
elsewhere than the already considered law of large numbers.</span><span style="font-family: "Times New Roman", serif; font-size: 13.5pt;"><o:p></o:p></span></div>
<div class="MsoPlainText">
<br /></div>
<div class="MsoPlainText">
<span style="font-family: "Courier New";">The above weak
points of the law of large numbers are just a small part of the analytics
"peculiarities" that can be expected in big data. <o:p></o:p></span></div>
<div class="MsoPlainText">
<span style="font-family: "Courier New";">This post is
the first in a series of essays on a proposed new concept of science in view of
the IT industrial revolution.<o:p></o:p></span></div>
<br />
<div class="MsoPlainText">
<br /></div>
<br />
GT data mining demonstration - finances<br />
Edith Ohri, published 2015-09-27<br />
<h1>
Prediction of the daily US $ up/down change<o:p></o:p></h1>
<div>
<br /></div>
<div class="MsoNormal" style="margin-left: 18.0pt;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq2XilyT04_DxE2R8-ItzJLjwDQF1eyUsaB78MYtbUSugwyrFlPQUykKgmleq1UeOGPbYADmbuYt7nmojVI43cOAM5o_oFHuTexAW5tzzA6aU_PMdJwppChEc1j-FDM48B-0cnu6gdlgWH/s1600/Target+pic+-+presentations.jpg" imageanchor="1" style="clear: right; float: right; margin-bottom: 1em; margin-left: 1em;"><img alt="" border="0" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEjq2XilyT04_DxE2R8-ItzJLjwDQF1eyUsaB78MYtbUSugwyrFlPQUykKgmleq1UeOGPbYADmbuYt7nmojVI43cOAM5o_oFHuTexAW5tzzA6aU_PMdJwppChEc1j-FDM48B-0cnu6gdlgWH/s1600/Target+pic+-+presentations.jpg" title="Targer" /></a><br />
<div class="MsoNormal" style="background: white; margin-left: -18.0pt;">
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 11.0pt; mso-fareast-font-family: "Times New Roman";">Edith Ohri, edith@fabhighq.com<o:p></o:p></span></div>
</div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<br />
<o:p></o:p></div>
<h2>
The goal<o:p></o:p></h2>
<div>
<br /></div>
<div class="MsoNormal">
<div class="MsoNormal">
<div class="MsoNormal" style="background: white;">
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 11.0pt; mso-fareast-font-family: "Times New Roman";">The goal of this demo is to predict the next
day's $ direction (UP or DOWN) with at least 55% accuracy.<u1:p></u1:p></span><span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white;">
<br /></div>
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 11.0pt; mso-ansi-language: EN-US; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: HE; mso-fareast-font-family: "Times New Roman"; mso-fareast-language: EN-US;">To attain this, one is required to establish objective rules and formulas that are </span><span style="color: #666666; font-family: 'Trebuchet MS', sans-serif; font-size: 14.6667px;">independent of the specific input</span><span style="color: #666666; font-family: 'Trebuchet MS', sans-serif; font-size: 11pt;">.</span><br />
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 11.0pt; mso-ansi-language: EN-US; mso-bidi-font-family: "Times New Roman"; mso-bidi-language: HE; mso-fareast-font-family: "Times New Roman"; mso-fareast-language: EN-US;">
<!--[if !supportLineBreakNewLine]--><br />
<!--[endif]--></span></div>
<div class="MsoNormal">
<span style="font-size: 11.0pt;"> <o:p></o:p></span></div>
</div>
<div align="right" class="MsoNormal" style="text-align: right;">
<br /></div>
<h2>
How it works<o:p></o:p></h2>
<div class="MsoNormal" style="margin-left: 36.0pt; mso-list: l3 level1 lfo1; text-indent: -18.0pt;">
<br />
<div class="MsoNormal">
<span style="font-size: 11.0pt;">a - Deciding on
input (from already existing available sources)<o:p></o:p></span></div>
<div class="MsoNormal">
<span style="font-size: 11.0pt;">b - Defining with
GT the patterns of behavior (= groups) and cause-effect formulas.<o:p></o:p></span></div>
<div class="MsoNormal">
<span style="font-size: 11.0pt;">c - Together, the
above constitute an expert system that is then used for early alerts and real-time
decisions.<o:p></o:p></span></div>
<div class="MsoNormal">
<span style="font-size: 11.0pt;">d - The expert
system can improve itself and periodically reviews its rules.<o:p></o:p></span></div>
<br />
<div class="MsoNormal">
<span style="font-size: 11.0pt;">Note: GT's formulas can be
integrated in the control of almost any product.<o:p></o:p></span></div>
</div>
<div class="MsoNormal">
<br />
<br /></div>
<h2>
The data set<o:p></o:p></h2>
<div class="MsoNormal">
<div class="MsoNormal">
<span style="font-size: 11.0pt;">The data include <u>760</u> daily
records over a period of two and a half years, and 7 variables: Date, Open price,
Close price, High, Low, and an index named RSI (Relative Strength Index).
RSI compares the magnitude of recent gains to recent losses in an attempt to
determine overbought and oversold conditions. When it goes above 70 or below 30, it
indicates that an asset is overbought or oversold, respectively, and vulnerable to a
trend reversal.<o:p></o:p></span></div>
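<div class="MsoNormal">
<span style="font-size: 11.0pt;">For readers unfamiliar with RSI, here is a minimal Python sketch of one common way to compute it. It is not taken from the demo: the 14-day period, the use of simple (unsmoothed) averages, the value returned for a completely flat series, and the function names are all assumptions made for illustration. Only the 70 and 30 thresholds come from the description above.</span></div>

```python
def rsi(closes, period=14):
    """Simple-average RSI (0 to 100) over the last `period` price changes.

    Assumption: plain averages of gains and losses; other variants smooth them.
    """
    if len(closes) < period + 1:
        raise ValueError("need at least period + 1 closing prices")
    # Pair each close with the next one to get the last `period` daily changes.
    changes = [b - a for a, b in zip(closes[-period - 1:-1], closes[-period:])]
    avg_gain = sum(c for c in changes if c > 0) / period
    avg_loss = sum(-c for c in changes if c < 0) / period
    if avg_loss == 0:
        # No losses at all: maximal RSI, or a neutral 50 for a flat series (assumed convention).
        return 100.0 if avg_gain > 0 else 50.0
    rs = avg_gain / avg_loss  # relative strength: recent gains versus recent losses
    return 100.0 - 100.0 / (1.0 + rs)


def rsi_signal(value, high=70.0, low=30.0):
    """Label an RSI value using the 70/30 thresholds mentioned in the text."""
    if value > high:
        return "overbought"
    if value < low:
        return "oversold"
    return "neutral"


rising = [float(p) for p in range(1, 17)]  # 15 consecutive daily gains
print(rsi(rising), rsi_signal(rsi(rising)))
```

<div class="MsoNormal">
<span style="font-size: 11.0pt;">A series of uninterrupted gains gives the maximal value and is labeled overbought; an equal mix of gains and losses of the same size gives 50.</span></div>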
<div class="MsoNormal">
<span style="font-size: 11.0pt;">Remark: trade volume
information could not be obtained for this demo.<o:p></o:p></span></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
<span style="font-size: 11.0pt;">On top of the 7 basic
variables, another 30 or more calculated variables were added, such as Trends,
Week Days etc.<o:p></o:p></span></div>
<br />
<div class="MsoNormal">
<span style="font-size: 11.0pt;">The Test set includes <u>122</u> records
from the <u>end</u> of the period.<o:p></o:p></span></div>
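<div class="MsoNormal">
<span style="font-size: 11.0pt;">Taking the Test set from the end of the period keeps the evaluation honest: the rules are learned only from records that precede the ones they are tested on. A minimal Python sketch of such a chronological split follows. The record layout (a dictionary with a "date" key) and the placeholder dates are assumptions for illustration; only the sizes 760 and 122 come from the demo.</span></div>

```python
from datetime import date, timedelta


def chronological_split(records, test_size):
    """Hold out the last `test_size` records by date; all earlier ones are for learning."""
    if not 0 < test_size < len(records):
        raise ValueError("test_size must be between 1 and len(records) - 1")
    ordered = sorted(records, key=lambda r: r["date"])
    return ordered[:-test_size], ordered[-test_size:]


# Placeholder records: 760 consecutive days (hypothetical dates, no real prices).
start = date(2013, 1, 1)
records = [{"date": start + timedelta(days=i)} for i in range(760)]

learning_set, test_set = chronological_split(records, test_size=122)
print(len(learning_set), len(test_set))
```

<div class="MsoNormal">
<span style="font-size: 11.0pt;">With 760 records and a Test set of 122, the Learning set is left with the 638 earliest records.</span></div>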
</div>
<div class="MsoNormal">
<br /></div>
<br />
<div class="separator" style="clear: both; text-align: center;">
<a href="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigkc3wyXX0D-BozAJcp6zSt9w1_6wc_PwrhWR632JFecnAe0W1IatyVW5pKW4h2g5BVKHUP7LBWC8LeTikqrAdD7pAksU2DlWttd1y-khM262dKAfF4JBMFvctj1r_g8p88V2EczUkuOo_/s1600/Input+data+for+daily+prediction+of+the+US+%2524+increase-decrease+demo+-+financial.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><span style="background-color: white; color: white;"><img border="0" height="403" src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEigkc3wyXX0D-BozAJcp6zSt9w1_6wc_PwrhWR632JFecnAe0W1IatyVW5pKW4h2g5BVKHUP7LBWC8LeTikqrAdD7pAksU2DlWttd1y-khM262dKAfF4JBMFvctj1r_g8p88V2EczUkuOo_/s640/Input+data+for+daily+prediction+of+the+US+%2524+increase-decrease+demo+-+financial.jpg" width="640" /></span></a></div>
<div align="center" class="MsoCaption" style="text-align: center;">
Figure <!--[if supportFields]><span
style='mso-element:field-begin'></span><span
style='mso-spacerun:yes'> </span>SEQ Figure \* ARABIC <span style='mso-element:
field-separator'></span><![endif]-->1<!--[if supportFields]><span
style='mso-element:field-end'></span><![endif]--> Input records<o:p></o:p></div>
<div class="MsoNormal">
<br />
<br /></div>
<h2>
The GT Learning results<o:p></o:p></h2>
<div class="MsoNormal">
<div class="MsoNormal">
<span style="font-size: 11.0pt;">The first step is to set a
lower hurdle: the best results that one can achieve without
the GT algorithm.<o:p></o:p></span></div>
<div class="MsoNormal">
<span style="font-size: 11.0pt;">Here the lower hurdle was
55.7% right predictions in the Test set, and 56.6% in the Learning set.<o:p></o:p></span></div>
<br />
<div class="MsoNormal">
<span style="font-size: 11.0pt;">Remark: the good results are
credited to the discovery of typical weekday Close price changes.<o:p></o:p></span></div>
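<div class="MsoNormal">
<span style="font-size: 11.0pt;">One simple way to build a weekday-based hurdle of this kind is sketched below in Python. It is an assumption about how such a baseline could look, not the procedure used in the demo: it learns the most frequent direction for each weekday from the Learning set and applies it to the Test set. The toy rows and all names are hypothetical.</span></div>

```python
from collections import Counter, defaultdict


def fit_weekday_rule(rows):
    """Learn the most frequent direction ('UP' or 'DOWN') for each weekday."""
    counts = defaultdict(Counter)
    for weekday, direction in rows:
        counts[weekday][direction] += 1
    return {day: c.most_common(1)[0][0] for day, c in counts.items()}


def accuracy(rule, rows, default="UP"):
    """Share of rows whose direction matches the rule; unseen weekdays use `default`."""
    hits = sum(rule.get(weekday, default) == direction for weekday, direction in rows)
    return hits / len(rows)


# Toy data only: (weekday, direction of the Close price change).
learning_rows = [("Mon", "UP"), ("Mon", "UP"), ("Mon", "DOWN"),
                 ("Fri", "DOWN"), ("Fri", "DOWN"), ("Fri", "UP")]
test_rows = [("Mon", "UP"), ("Fri", "DOWN"), ("Fri", "UP"), ("Mon", "UP")]

rule = fit_weekday_rule(learning_rows)
print(rule, accuracy(rule, test_rows))
```

<div class="MsoNormal">
<span style="font-size: 11.0pt;">Any method that claims an improvement should then beat this figure on the same Test set.</span></div>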
<div class="MsoNormal">
<span style="font-size: 11.0pt;"><br /></span></div>
</div>
<h2>
Reaching beyond the assigned target<o:p></o:p></h2>
<div class="MsoNormal">
<div class="MsoNormal">
<span style="font-size: 11.0pt;">The assigned target of 55%
prediction success was achieved, but it can be further improved with the GT
Patterns-of-Behavior definition.<o:p></o:p></span></div>
<br />
<div class="MsoNormal">
<span style="font-size: 11.0pt;">It is a well-known (and
quite intuitive) law that greater precision can always be attained
by adjusting the prediction factors to the subgroups of a given dataset.
Following is a short demonstration of this law, employing the special
abilities of the GT algorithm.<o:p></o:p></span></div>
</div>
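<div class="MsoNormal">
<span style="font-size: 11.0pt;">The subgroup principle can be checked on a toy example. The Python sketch below compares the squared error of predicting every record by the overall mean with the error of predicting each record by the mean of its own subgroup. It does not use GT; the groups and values are made up, and the function names are hypothetical. On the data used for fitting, the per-group error can never be larger than the pooled one.</span></div>

```python
from collections import defaultdict
from statistics import fmean


def sse_pooled(rows):
    """Squared error when every record is predicted by the overall mean."""
    overall = fmean(value for _, value in rows)
    return sum((value - overall) ** 2 for _, value in rows)


def sse_by_group(rows):
    """Squared error when each record is predicted by the mean of its own subgroup."""
    groups = defaultdict(list)
    for group, value in rows:
        groups[group].append(value)
    total = 0.0
    for values in groups.values():
        group_mean = fmean(values)
        total += sum((value - group_mean) ** 2 for value in values)
    return total


# Toy data only: (subgroup label, observed value).
rows = [("A", 1.0), ("A", 3.0), ("B", 10.0), ("B", 12.0)]
print(sse_pooled(rows), sse_by_group(rows))
```

<div class="MsoNormal">
<span style="font-size: 11.0pt;">Note that this guarantee holds for the data the means were fitted on. On held-out records, very small subgroups may predict worse, which is why overfitting has to be kept in check.</span></div>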
<div class="MsoNormal">
<br /></div>
<div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm; margin-left: 36.0pt; margin-right: 0cm; margin-top: 0cm; mso-list: l0 level1 lfo1; tab-stops: list 36.0pt; text-indent: -18.0pt;">
<div class="MsoNormal">
<div class="MsoNormal" style="background: white; tab-stops: list 36.0pt; text-indent: -18.0pt;">
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";"><br /></span></div>
</div>
</div>
</div>
<h2>
GT Results* <o:p></o:p></h2>
<div class="MsoNormal" style="mso-pagination: widow-orphan lines-together; page-break-after: avoid;">
<div class="MsoNormal">
<span style="font-size: 11.0pt;">(* Initial results, for this
demonstration)<o:p></o:p></span></div>
<br /></div>
<div align="center">
<table border="1" cellpadding="0" cellspacing="0" class="MsoNormalTable" style="background: #F1F2F5; border-collapse: collapse; border: none; mso-border-alt: solid white 1.0pt; mso-border-insideh: 1.0pt solid white; mso-border-insidev: 1.0pt solid white; mso-padding-alt: 0cm 0cm 0cm 0cm; mso-yfti-tbllook: 1536; width: 392px;">
<tbody>
<tr style="height: 48.0pt; mso-yfti-firstrow: yes; mso-yfti-irow: 0;">
<td style="border: solid white 1.0pt; height: 48.0pt; padding: .75pt .75pt 0cm .75pt; width: 294.0pt;" width="392"><div align="center" class="MsoNormal" style="mso-pagination: widow-orphan lines-together; page-break-after: avoid; text-align: center; vertical-align: bottom;">
<span style="color: blue; font-size: large;">Count of true/false predictions<span style="font-size: 18.0pt; mso-fareast-font-family: "Times New Roman";">:<o:p></o:p></span></span></div>
</td>
</tr>
<tr style="height: 48.0pt; mso-yfti-irow: 1;">
<td style="border-top: none; border: solid white 1.0pt; height: 48.0pt; mso-border-top-alt: solid white 1.0pt; padding: .75pt .75pt 0cm .75pt; width: 294.0pt;" width="392"><div align="center" class="MsoNormal" style="mso-pagination: widow-orphan lines-together; page-break-after: avoid; text-align: center; vertical-align: bottom;">
<span style="color: blue; font-size: large;"><b><span style="color: red; font-size: 18.0pt; mso-fareast-font-family: "Times New Roman"; mso-font-kerning: 12.0pt;">Right
-<span dir="RTL"></span><span dir="RTL"></span><span dir="RTL" lang="HE"><span dir="RTL"></span><span dir="RTL"></span> </span><span dir="LTR"></span><span dir="LTR"></span><span dir="LTR"></span><span dir="LTR"></span>59%</span></b><o:p></o:p></span></div>
</td>
</tr>
<tr style="height: 48.35pt; mso-yfti-irow: 2; mso-yfti-lastrow: yes;">
<td style="border-top: none; border: solid white 1.0pt; height: 48.35pt; mso-border-top-alt: solid white 1.0pt; padding: .75pt .75pt 0cm .75pt; width: 294.0pt;" width="392"><div align="center" class="MsoNormal" style="mso-pagination: widow-orphan lines-together; page-break-after: avoid; text-align: center; vertical-align: bottom;">
<span style="color: blue; font-size: large;">Wrong
-<span dir="RTL"></span><span dir="RTL"></span><span dir="RTL"><span dir="RTL"></span><span dir="RTL"></span> </span><span dir="LTR"></span><span dir="LTR"></span><span dir="LTR"></span><span dir="LTR"></span>41%</span><span style="font-size: 18.0pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></div>
</td>
</tr>
</tbody></table>
</div>
<div class="MsoNormal" style="margin-left: 36.0pt; mso-list: l1 level1 lfo3; mso-pagination: widow-orphan lines-together; page-break-after: avoid; text-indent: -18.0pt;">
<div>
<br /></div>
<div>
<div class="MsoListParagraphCxSpFirst" style="mso-list: l0 level1 lfo1; text-indent: -18.0pt;">
</div>
<br />
<ul>
<ul type="disc">
<li class="MsoNormal" style="background: white; color: #666666; mso-list: l0 level1 lfo1; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; mso-pagination: widow-orphan lines-together; page-break-after: avoid; tab-stops: list 36.0pt;"><span style="font-family: "Trebuchet MS","sans-serif"; font-size: 11.0pt; mso-fareast-font-family: "Times New Roman";">An improvement of about 3 percentage points in
right predictions was achieved at just the beginning of the GT process.</span><span style="font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></li>
<li class="MsoNormal" style="background: white; color: #666666; mso-list: l0 level1 lfo1; mso-margin-bottom-alt: auto; mso-margin-top-alt: auto; mso-pagination: widow-orphan lines-together; page-break-after: avoid; tab-stops: list 36.0pt;"><span style="font-family: "Trebuchet MS","sans-serif"; font-size: 11.0pt; mso-fareast-font-family: "Times New Roman";">With full data mining and input
that includes detailed transactions, further significant improvement can
be expected.</span><span style="font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span></li>
</ul>
</ul>
<div>
<span style="color: #666666; font-family: Trebuchet MS, sans-serif;"><span style="font-size: 14.6667px;"><br /></span></span></div>
</div>
<div>
<h2>
Improvement tips<o:p></o:p></h2>
<div>
<div class="MsoNormal" style="margin-bottom: .0001pt; margin-bottom: 0cm; margin-left: 36.0pt; margin-right: 0cm; margin-top: 0cm; mso-list: l0 level1 lfo1; tab-stops: list 36.0pt; text-indent: -18.0pt;">
<div class="MsoNormal">
<b><span style="font-size: 11pt;"> </span></b><b><span style="font-size: 11pt;"> </span></b><b style="text-indent: -18pt;"> </b><br />
<div class="MsoNormal" style="background: white; margin-left: 18.0pt; mso-list: l1 level1 lfo1; tab-stops: list -36.0pt 36.0pt; text-indent: -18.0pt;">
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-bidi-font-family: "Trebuchet MS"; mso-fareast-font-family: "Trebuchet MS";">1.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span dir="LTR"></span><span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 11.0pt; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: "Times New Roman";">Include non-linear variables if there are any, for example "RSI", a non-linear Relative Strength Index that describes the pressure on prices due to excess demand or supply.</span><span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";"><u1:p></u1:p><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; tab-stops: list 36.0pt; text-indent: -18.0pt;">
<br /></div>
<div class="MsoNormal" style="background: white; margin-left: 18.0pt; mso-list: l0 level1 lfo2; tab-stops: list -18.0pt 36.0pt; text-indent: -18.0pt;">
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-bidi-font-family: "Trebuchet MS"; mso-fareast-font-family: "Trebuchet MS";">2.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span dir="LTR"></span><span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 11.0pt; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: "Times New Roman";">Split the data into hierarchical patterns of behavior.<br /> </span><span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";"><u1:p></u1:p><o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; margin-left: 18.0pt; mso-list: l0 level1 lfo2; tab-stops: list 0cm 36.0pt; text-indent: -18.0pt;">
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-bidi-font-family: "Trebuchet MS"; mso-fareast-font-family: "Trebuchet MS";">3.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;"> </span></span><span dir="LTR"></span><span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 11.0pt; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: "Times New Roman";">Avoid "overfitting" by moving on to new subsets of data once the information in the current subset has been exhausted.</span><span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 13.5pt; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: "Times New Roman";"> <u1:p></u1:p></span><span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 10.0pt; mso-fareast-font-family: "Times New Roman";"><o:p></o:p></span><br />
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 13.5pt; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: "Times New Roman";"> </span><br />
<span style="color: #666666; font-family: "Trebuchet MS","sans-serif"; font-size: 13.5pt; mso-bidi-font-family: "Times New Roman"; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: "Times New Roman";"><br /></span></div>
<div class="MsoNormal" style="background: white; tab-stops: list 36.0pt; text-indent: -18.0pt;">
</div>
</div>
</div>
</div>
</div>
</div>
<h2>
Conclusion of example demo<o:p></o:p></h2>
<div>
</div>
<div class="MsoNormal">
<div class="MsoNormal" style="background: white; line-height: 13.85pt; margin-bottom: .0001pt; margin: 0cm;">
<span style="color: #666666; font-size: 11.0pt; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: "Times New Roman";">GT
proves effective in predicting the daily USD trend.<o:p></o:p></span></div>
<div class="MsoNormal" style="background: white; line-height: 13.85pt; margin-bottom: .0001pt; margin: 0cm;">
<span style="color: #666666; font-size: 11.0pt; mso-bidi-font-family: Arial; mso-bidi-theme-font: minor-bidi; mso-fareast-font-family: "Times New Roman";">Finding
the patterns (clusters) enables a separate prediction for each segment and
greater precision.<o:p></o:p></span></div>
<br />
<div class="MsoNormal">
<br /></div>
</div>
<div class="MsoNormal">
<br /></div>
<h2>
GT success is in its Industrial & Management Engineering roots<o:p></o:p></h2>
<div>
</div>
<div>
</div>
<div class="MsoNormal" style="margin-left: 18.0pt; mso-list: l0 level1 lfo4; text-indent: -18.0pt;">
<div style="margin-bottom: .0001pt; margin-bottom: 0cm; margin-left: 21.75pt; margin-right: 0cm; margin-top: 0cm; mso-list: l0 level1 lfo1; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">a.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">Its first application was
on the job, where the assignment was very practical: to improve the line
workflow, not to invent a theoretical model.<o:p></o:p></span></div>
<div style="margin-bottom: .0001pt; margin: 0cm;">
<br /></div>
<div style="margin: 0cm 0cm 0.0001pt 21.75pt; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">b.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">Industrial Engineers are
almost never experts in the area of application, therefore the model needed to
be strengthened with scientific internal validations.<o:p></o:p></span></div>
<div style="margin: 0cm 0cm 0.0001pt;">
<br /></div>
<div style="margin: 0cm 0cm 0.0001pt 21.75pt; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">c.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">As is often done in IE, the
development was carried out without investors. That fact enabled a very long
incubation period and the accumulation of important personal experience.<o:p></o:p></span></div>
<div style="margin: 0cm 0cm 0.0001pt;">
<br /></div>
<div style="margin: 0cm 0cm 0.0001pt 21.75pt; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">d.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">The IE practical approach
led to focusing on "discovery of hidden patterns", instead of the
more academic approach that prioritizes correlations and the speed of
execution.<o:p></o:p></span></div>
<div style="margin: 0cm 0cm 0.0001pt;">
<br /></div>
<div style="margin: 0cm 0cm 0.0001pt 21.75pt; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">e.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">Full cycle product costs
of implementation are considered, no hard sell wizardry.<o:p></o:p></span></div>
<div style="margin: 0cm 0cm 0.0001pt;">
<br /></div>
<div style="margin: 0cm 0cm 0.0001pt 21.75pt; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">f.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">Real work forced an early
start on the algorithm, which turned out to help greatly in avoiding
conventional misconceptions...<o:p></o:p></span></div>
<div style="margin: 0cm 0cm 0.0001pt;">
<br /></div>
<div style="margin: 0cm 0cm 0.0001pt 21.75pt; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">g.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">Product development means
primarily substantiating the product's work method, not its market share.<o:p></o:p></span></div>
<div style="margin: 0cm 0cm 0.0001pt;">
<br /></div>
<div style="margin: 0cm 0cm 0.0001pt 21.75pt; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">h.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">From an IE perspective it is
only natural to offer a SaaS option.<o:p></o:p></span></div>
<div style="margin: 0cm 0cm 0.0001pt;">
<br /></div>
<div style="margin: 0cm 0cm 0.0001pt 21.75pt; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">i.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">IE should always adhere to
the actual implementation on top of business musts.<o:p></o:p></span></div>
<div style="margin: 0cm 0cm 0.0001pt;">
<br /></div>
<div style="margin: 0cm 0cm 0.0001pt 21.75pt; text-indent: -21.75pt;">
<!--[if !supportLists]--><span style="font-family: Arial, sans-serif; font-size: 11pt;">j.<span style="font-family: 'Times New Roman'; font-size: 7pt; font-stretch: normal;">
</span></span><!--[endif]--><span dir="LTR"></span><span style="font-family: Arial, sans-serif; font-size: 11pt;">High-tech or not "we
do business the old way, we earn it".<o:p></o:p></span></div>
<br />
<div class="MsoNormal">
<br /></div>
<div style="margin-bottom: .0001pt; margin: 0cm;">
<br /></div>
</div>
<div class="MsoNormal" style="margin-left: 18.0pt; mso-list: l0 level1 lfo4; text-indent: -18.0pt;">
<o:p></o:p></div>
<div class="MsoNormal">
<br /></div>
<div class="MsoNormal">
~~~~</div>
<div class="MsoNormal">
<br />
Edith Ohri, Home of GT data mining<o:p></o:p></div>
Edith Ohrihttp://www.blogger.com/profile/08106475396227818104noreply@blogger.com0tag:blogger.com,1999:blog-138533131377515866.post-14111049310518921212015-09-15T05:40:00.003-07:002015-09-20T05:41:45.617-07:00Digging in financial data<h2>
Using any data for in-depth conclusions</h2>
<h3>
<b>Lessons from a </b> GT study<b>* of 1,000 NYSE companies from the year 2000, just before the dot-com bubble crash. </b></h3>
---------------<br />
<span style="font-size: x-small;">* <a href="https://docs.google.com/file/d/0B1tc2-%20duf3_4YzM2M2M2OWMtZjAwNS00Y2FlLWJhOWUtOTc3ZjM3NTY1YzVm/edit?usp=sharing" target="_blank">https://docs.google.com/file/d/0B1tc2-duf3_4YzM2M2M2OWMtZjAwNS00Y2FlLWJhOWUtOTc3ZjM3NTY1YzVm/edit?usp=sharing</a></span><br />
<div>
<br /></div>
<br />
<u>Conclusion 1</u><br />
<div>
<b>A pattern of behavior can be as small as a fraction of 1% of the total number of events. </b><br />
In this study, GT found a tiny subgroup of "exception" companies containing only 4 out of 1,000. It consists of 4 banks with the exceptional feature of very high net profit, twice as much as others in the financial sector. An explanation for their unusually high performance was offered 8 years(!) later, during the 2008 credit/derivatives crisis, when the 4 banks' names were mentioned in news headlines.</div>
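<div>
To make "twice as much as others in the sector" concrete, here is a minimal sketch. It is not the GT algorithm, which found the subgroup without being told where to look; it is only a simple threshold rule applied after the fact. The company names, sectors, and margins are made up for illustration, and the function name <code>find_exceptions</code> and the <code>ratio</code> parameter are my own assumptions.</div>

```python
from statistics import median

# Hypothetical, made-up records: (company, sector, net profit margin in %).
# They are NOT taken from the study's data set.
companies = [
    ("Bank A", "Financial", 31.0),
    ("Bank B", "Financial", 14.0),
    ("Bank C", "Financial", 15.0),
    ("Bank D", "Financial", 13.0),
    ("Bank E", "Financial", 16.0),
    ("Maker A", "Industrial", 6.0),
    ("Maker B", "Industrial", 7.0),
    ("Maker C", "Industrial", 5.0),
]


def find_exceptions(records, ratio=2.0):
    """Return names whose margin is at least `ratio` times their sector median."""
    by_sector = {}
    for _, sector, margin in records:
        by_sector.setdefault(sector, []).append(margin)
    sector_median = {s: median(v) for s, v in by_sector.items()}
    return [
        name
        for name, sector, margin in records
        if margin >= ratio * sector_median[sector]
    ]


exceptions = find_exceptions(companies)
print(exceptions)
```

<div>
With these invented numbers only "Bank A" is flagged. Note that a rule like this assumes we already know which feature and which grouping to check, which is exactly the part that has to be discovered in real data.</div>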
<div>
<br />
<u>Conclusion 2</u></div>
<div>
<b>Large data sets require a general view on top of the detailed one.</b> Here the general view fit almost exactly the common definition of industries. There is only one difference, yet a most significant one: some giant corporations are found to behave like financial institutions rather than like their own industries. This observation strengthens our understanding of the 2008 crisis.<br />
<br />
<u>Conclusion 3</u></div>
<div>
<b>Big data is about using AVAILABLE unsupervised data, without the cleaning that is commonly suggested.</b><br />
This study is based on free data from http://www.ics.uci.edu. The data quality seems insufficient for research: there is no historical "depth", no share-value information, and the sample does not reflect the subgroups. Yet GT returned quite good results. This means that data are useful even if partial and unsupervised! </div>
<div>
<div>
<div>
<br /></div>
</div>
</div>
Edith Ohrihttp://www.blogger.com/profile/08106475396227818104noreply@blogger.com0tag:blogger.com,1999:blog-138533131377515866.post-4633936689599316272013-06-13T05:45:00.001-07:002013-09-28T15:31:00.384-07:00Some thoughts on big data challenges<strong></strong><br />
<span style="font-family: Times, "Times New Roman", serif;"><strong>Here is a list of challenges from my personal encounters with the subject:</strong><br />
</span><br />
<ol><span style="font-family: Times, "Times New Roman", serif;">
<li><strong>How to make use of unsupervised data? </strong></li>
<li><strong>Untangling mixed phenomena</strong></li>
<li><strong>The need for on-time (unexpected) decisions</strong></li>
<li><strong>Identifying "black swans"</strong></li>
<li><strong>Deploying legacy data (similar to #1, using unsupervised data)</strong></li>
<li><strong>Devising a method for exponential growth of data </strong></li>
<li><strong>Using old tools in a new environment</strong></li>
<li><strong>Is there any size that is too big to handle?</strong></li>
<li><strong>Statistics in a dynamic reality</strong></li>
<li><strong>What would be considered a right hypothesis? <br />(or is there such a thing as a wrong question to ask?)</strong></li>
</span></ol>
<br />
~~~~~~~~~~~<br />
<div class="MsoNormal" style="margin: 1in 0.5in 0.5pt;">
A Buddhist story about blind men trying to describe an elephant:</div>
<br />
<br />
<div class="MsoNormal" style="margin: 0in 0.5in 0.5pt;">
<span style="font-family: Calibri;">Five blind people were asked to describe an elephant. Each felt
a part of the elephant. One person felt the elephant's trunk and said
it is just like a plow pole. A second person touched the elephant's foot
and said it is just like a post. A third person felt the elephant's tusk and
said it is just like a plowshare. A fourth person had a hold of the elephant's
tail and said it is just like a broom. A fifth person felt the elephant's
ear and said it is like a winnowing basket. As each one described the elephant, the
others disagreed...<o:p></o:p></span></div>
<br />
<strong></strong><br />
<div class="separator" style="clear: both; text-align: center;">
<a href="http://www.yoism.org/images/elephant.jpg" imageanchor="1" style="margin-left: 1em; margin-right: 1em;"><img alt="" border="0" height="177" src="http://www.yoism.org/images/elephant.jpg" title="Old print of ''The Blind Men and the Elephant''" width="320" /></a></div>
<div style="text-align: center;">
<a href="http://www.yoism.org/?q=node/221"><span style="font-size: xx-small;">http://www.yoism.org/?q=node/221</span></a></div>
<br />Edith Ohrihttp://www.blogger.com/profile/08106475396227818104noreply@blogger.com1tag:blogger.com,1999:blog-138533131377515866.post-18481243634948380412013-06-05T11:01:00.000-07:002017-06-14T02:58:34.569-07:00First introduction<div dir="LTR">
<div dir="LTR">
Why GT, and why data mining at all?<br />
My quest for a mining algorithm started a long while ago; I sort of grew up with the field. I was intrigued to know how natural formations of data (clusters) occur. Are there any principles? And how may one make use of them?<br />
In this blog I'll try to write about these and other aspects of what makes data mining tick.<u1:p></u1:p><u2:p></u2:p><o:p></o:p></div>
<div dir="LTR">
Thanks to Avishai Schur from <a href="http://www.fabhighq.com/" target="_blank"><span style="color: blue;">FabHighQ</span></a> for encouraging me to open this blog. See also the presentation list and posts in Hebrew at http://gtdatamning-heb.blogspot.co.il/<o:p></o:p></div>
<div dir="LTR">
</div>
Your comments are appreciated.<br />
Edith</div>
<div dir="LTR">
<o:p></o:p></div>
<div dir="LTR">
</div>
<div dir="LTR">
</div>
<div dir="LTR">
</div>
<div dir="LTR" style="margin: 0.25in 0in 12pt;">
<u1:p></u1:p><u2:p></u2:p><b><i><span style="font-size: 16pt;">What
is GT and what does it stand for?</span></i></b><b><span style="font-size: 16pt;"><o:p></o:p></span></b></div>
<div dir="LTR">
<u1:p></u1:p>GT is a solution for creating new hypotheses based on
identifying patterns of behavior. What makes it special is its <b>hierarchical
clustering</b> and analysis of <b>unsupervised data</b>. <u1:p></u1:p><u2:p></u2:p><o:p></o:p></div>
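<div dir="LTR">
For readers unfamiliar with the term, here is a minimal sketch of hierarchical clustering on unlabeled points. It uses plain single-linkage agglomerative clustering, which is my assumption for illustration only and is not the GT algorithm. The sample points and the function names <code>cluster_gap</code> and <code>single_linkage</code> are hypothetical.</div>

```python
from itertools import combinations
from math import dist


def cluster_gap(points, a, b):
    """Single-linkage distance: the closest pair of members across two clusters."""
    return min(dist(points[i], points[j]) for i in a for j in b)


def single_linkage(points):
    """Repeatedly merge the two closest clusters.

    Returns the merge history as (cluster_a, cluster_b, distance) tuples,
    where each cluster is a tuple of point indices.
    """
    clusters = [(i,) for i in range(len(points))]
    history = []
    while len(clusters) > 1:
        a, b = min(
            combinations(clusters, 2),
            key=lambda pair: cluster_gap(points, *pair),
        )
        history.append((a, b, cluster_gap(points, a, b)))
        clusters = [c for c in clusters if c not in (a, b)] + [a + b]
    return history


# Made-up, unlabeled 2-D points: two tight pairs and one far-away point.
points = [(0, 0), (0, 1), (5, 5), (5, 6), (20, 20)]
history = single_linkage(points)
for a, b, d in history:
    print(f"merge {a} + {b} at distance {d:.2f}")
```

<div dir="LTR">
The merge history is the hierarchy: the tight pairs join first, then the pairs join each other, and the isolated point joins last. No labels are supplied at any stage, which is what "unsupervised" means here.</div>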
<div dir="LTR">
</div>
<div dir="LTR" style="margin: 0.25in 0in 12pt;">
<b><span style="font-size: 16pt;"><i>Origins </i></span></b></div>
<div dir="LTR">
The name GT stands for Group Technology, an old method of
Industrial Engineering that aims to increase the efficiency of production and
material handling by grouping items according to their similarity. In
today's work environment its function is extended from the original
shop-floor management to the management of "any type of database entities". In this sense, GT can be regarded as the abstract, generalized
model of the old Group Technology. <u1:p></u1:p><u2:p></u2:p><o:p></o:p></div>
<div dir="LTR">
Group Technology consists of several methods that were developed
through the years, starting in World War II, when the Russians needed
to relocate their factories to the East, where they would be safe from the advancing German army. Their idea was to keep the
different product lines in a simple order that would be quick to reconstruct.
The order they defined resembled the Western "production
line" approach, with one difference: instead of work orders for
identical items, the Russians allowed Groups of mixed items that shared the
same processing route. <u1:p></u1:p><u2:p></u2:p><o:p></o:p></div>
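<div dir="LTR">
The idea of Groups of mixed items sharing one processing route can be sketched in a few lines. The part names and routes below are invented for illustration, and <code>group_by_route</code> is a hypothetical helper, not part of any GT software.</div>

```python
from collections import defaultdict

# Hypothetical parts and their processing routes (sequence of operations).
routes = {
    "bracket": ("cut", "drill", "paint"),
    "hinge": ("cut", "drill", "paint"),
    "shaft": ("turn", "grind"),
    "pin": ("turn", "grind"),
    "panel": ("cut", "paint"),
}


def group_by_route(item_routes):
    """Collect different items that share exactly the same processing route."""
    groups = defaultdict(list)
    for item, route in item_routes.items():
        groups[route].append(item)
    return dict(groups)


groups = group_by_route(routes)
for route, items in groups.items():
    print(", ".join(route), ":", items)
```

<div dir="LTR">
Here the bracket and the hinge are different items, yet they form one Group because they pass through the same operations in the same order. Real Group Technology also has to deal with routes that are similar but not identical, which this sketch ignores.</div>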
<div dir="LTR">
</div>
<div dir="LTR" style="margin: 0.25in 0in 12pt;">
<b><span style="font-size: 16pt;"><i>Evolution</i></span></b></div>
<div dir="LTR">
The Group approach gained more and more appeal in the West due to
(to the best of my knowledge) two emerging technologies that later on swept the
manufacturing world: <u1:p></u1:p><u2:p></u2:p><o:p></o:p></div>
<div dir="LTR">
(a) <strong>Operations Research</strong> with its efficiency
optimization. One should mention Sir John Burbidge, a prominent professor at Cranfield
University, England, who was knighted by the Queen for his
activity in this field.<u1:p></u1:p><u2:p></u2:p><o:p></o:p></div>
<div dir="LTR">
(b) <strong>Flexible Manufacturing</strong> developed in Japan as
part of the CNC and cell-production concept. <u1:p></u1:p><u2:p></u2:p><o:p></o:p></div>
<div dir="LTR">
Both technologies, Operations Research and Flexible
Manufacturing, had to deal with increasingly diversified products and
activities, for which the flexibility embedded in the multi-functional Groups
had a tremendous advantage over the rigid idea of dedicated mass-production lines.<br />
<o:p></o:p></div>
<div dir="LTR">
</div>
<div dir="LTR">
Then in the 80's, a third leap occurred that brought forward the GT
idea as a desirable solution: the <strong>IT revolution</strong>. IT
introduced 'information' as an item in itself (not just an adjunct to 'real'
items), and in doing so opened the door to many new products and to changes in the
organization and the whole commercial scene. As IT redefined
almost everything, there was also a need to rebalance and regain efficiency, and
the GT ability to organize work in Groups or Clusters according to
processing sequence has proven more valid than ever. This need to reorganize production was the initial aim of my GT data mining algorithm.<u1:p></u1:p><u2:p></u2:p><o:p></o:p></div>
<div dir="LTR">
</div>
<div dir="LTR">
All the above-mentioned upheavals were, as it appears, just an introduction
to what has come to be known as <strong>Big Data</strong>. Big Data poses new
challenges to data mining analysts, above all two features that have now become critical: <b>AI and automation</b>. <u1:p></u1:p><u2:p></u2:p><o:p></o:p></div>
<div dir="LTR">
But how can AI rules be generated automatically? Can we replace the
expert in creating insights, observations, and new hypotheses?<br />
<u2:p></u2:p><o:p></o:p><br />
<br />
<u2:p> </u2:p><b><span style="font-size: 16pt;"><i>For testing purposes we have well-developed
methods, but for creating the hypotheses to be tested - nothing! </i></span></b></div>
<div dir="LTR">
This statement deserves a whole discussion of its own. For a start, the basic
solution of GT data mining is about making new rules and validating them methodically. <br />
<o:p></o:p>
</div>
<div dir="LTR">
<u2:p></u2:p><br />
<u2:p></u2:p><br />
<u2:p> </u2:p><o:p></o:p>
<img src="https://blogger.googleusercontent.com/img/b/R29vZ2xl/AVvXsEgcrRWEJAoTohhYQEpl5hRPTuWHxtfDxfgsKvpMGjLkIwLl_UJIusnbv8Poz9NUnRh9h24vE4En8AakDlKqcaDVnGHkgqo7lG1-bi8TI8WoCScJERPwbWYjiJd6jfhN2IONWPJT78krPi81/s320/Follow_us.jpg" />
</div>
<br />Edith Ohrihttp://www.blogger.com/profile/08106475396227818104noreply@blogger.com0