<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Transitional//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-transitional.dtd"> <html xmlns="http://www.w3.org/1999/xhtml"> <head> <meta http-equiv="Content-Type" content="text/html; charset=utf-8" /> <title>Finding structure in the stock market — scikits.learn v0.6.0 documentation</title> <link rel="stylesheet" href="../../_static/nature.css" type="text/css" /> <link rel="stylesheet" href="../../_static/pygments.css" type="text/css" /> <script type="text/javascript"> var DOCUMENTATION_OPTIONS = { URL_ROOT: '../../', VERSION: '0.6.0', COLLAPSE_INDEX: false, FILE_SUFFIX: '.html', HAS_SOURCE: true }; </script> <script type="text/javascript" src="../../_static/jquery.js"></script> <script type="text/javascript" src="../../_static/underscore.js"></script> <script type="text/javascript" src="../../_static/doctools.js"></script> <link rel="shortcut icon" href="../../_static/favicon.ico"/> <link rel="author" title="About these documents" href="../../about.html" /> <link rel="top" title="scikits.learn v0.6.0 documentation" href="../../index.html" /> <link rel="up" title="Examples" href="../index.html" /> <link rel="next" title="Libsvm GUI" href="svm_gui.html" /> <link rel="prev" title="Species distribution modeling" href="plot_species_distribution_modeling.html" /> </head> <body> <div class="header-wrapper"> <div class="header"> <p class="logo"><a href="../../index.html"> <img src="../../_static/scikit-learn-logo-small.png" alt="Logo"/> </a> </p><div class="navbar"> <ul> <li><a href="../../install.html">Download</a></li> <li><a href="../../support.html">Support</a></li> <li><a href="../../user_guide.html">User Guide</a></li> <li><a href="../index.html">Examples</a></li> <li><a href="../../developers/index.html">Development</a></li> </ul> <div class="search_form"> <div id="cse" style="width: 100%;"></div> <script src="http://www.google.com/jsapi" type="text/javascript"></script> <script type="text/javascript"> google.load('search', 
'1', {language : 'en'}); google.setOnLoadCallback(function() { var customSearchControl = new google.search.CustomSearchControl('016639176250731907682:tjtqbvtvij0'); customSearchControl.setResultSetSize(google.search.Search.FILTERED_CSE_RESULTSET); var options = new google.search.DrawOptions(); options.setAutoComplete(true); customSearchControl.draw('cse', options); }, true); </script> </div> </div> <!-- end navbar --></div> </div> <div class="content-wrapper"> <!-- <div id="blue_tile"></div> --> <div class="sphinxsidebar"> <div class="rel"> <a href="plot_species_distribution_modeling.html" title="Species distribution modeling" accesskey="P">previous</a> | <a href="svm_gui.html" title="Libsvm GUI" accesskey="N">next</a> | <a href="../../genindex.html" title="General Index" accesskey="I">index</a> </div> <h3>Contents</h3> <ul> <li><a class="reference internal" href="#">Finding structure in the stock market</a></li> </ul> </div> <div class="content"> <div class="documentwrapper"> <div class="bodywrapper"> <div class="body"> <div class="section" id="finding-structure-in-the-stock-market"> <span id="example-applications-stock-market-py"></span><h1>Finding structure in the stock market<a class="headerlink" href="#finding-structure-in-the-stock-market" title="Permalink to this headline">¶</a></h1> <p>An example that explores stock market data to find structure in it: companies are clustered by the correlation of their daily price variations.</p> <p><strong>Python source code:</strong> <a class="reference download internal" href="../../_downloads/stock_market.py"><tt class="xref download docutils literal"><span class="pre">stock_market.py</span></tt></a></p> <div class="highlight-python"><div class="highlight"><pre><span class="k">print</span> <span class="n">__doc__</span> <span class="c"># Author: Gael Varoquaux gael.varoquaux@normalesup.org</span> <span class="c"># License: BSD</span> <span class="kn">import</span> <span class="nn">datetime</span> <span class="kn">from</span> <span class="nn">matplotlib</span> <span 
class="kn">import</span> <span class="n">finance</span> <span class="kn">import</span> <span class="nn">numpy</span> <span class="kn">as</span> <span class="nn">np</span> <span class="kn">from</span> <span class="nn">scikits.learn</span> <span class="kn">import</span> <span class="n">cluster</span> <span class="c"># Choose a reasonably calm time period (not too long ago so that we get</span> <span class="c"># high-tech firms, and before the 2008 crash)</span> <span class="n">d1</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2003</span><span class="p">,</span> <span class="mo">01</span><span class="p">,</span> <span class="mo">01</span><span class="p">)</span> <span class="n">d2</span> <span class="o">=</span> <span class="n">datetime</span><span class="o">.</span><span class="n">datetime</span><span class="p">(</span><span class="mi">2008</span><span class="p">,</span> <span class="mo">01</span><span class="p">,</span> <span class="mo">01</span><span class="p">)</span> <span class="n">symbol_dict</span> <span class="o">=</span> <span class="p">{</span> <span class="s">'TOT'</span> <span class="p">:</span> <span class="s">'Total'</span><span class="p">,</span> <span class="s">'XOM'</span> <span class="p">:</span> <span class="s">'Exxon'</span><span class="p">,</span> <span class="s">'CVX'</span> <span class="p">:</span> <span class="s">'Chevron'</span><span class="p">,</span> <span class="s">'COP'</span> <span class="p">:</span> <span class="s">'ConocoPhillips'</span><span class="p">,</span> <span class="s">'VLO'</span> <span class="p">:</span> <span class="s">'Valero Energy'</span><span class="p">,</span> <span class="s">'MSFT'</span> <span class="p">:</span> <span class="s">'Microsoft'</span><span class="p">,</span> <span class="s">'IBM'</span> <span class="p">:</span> <span class="s">'IBM'</span><span class="p">,</span> <span class="s">'TWX'</span> 
<span class="p">:</span> <span class="s">'Time Warner'</span><span class="p">,</span> <span class="s">'CMCSA'</span><span class="p">:</span> <span class="s">'Comcast'</span><span class="p">,</span> <span class="s">'CVC'</span> <span class="p">:</span> <span class="s">'Cablevision'</span><span class="p">,</span> <span class="s">'YHOO'</span> <span class="p">:</span> <span class="s">'Yahoo'</span><span class="p">,</span> <span class="s">'DELL'</span> <span class="p">:</span> <span class="s">'Dell'</span><span class="p">,</span> <span class="s">'HPQ'</span> <span class="p">:</span> <span class="s">'Hewlett-Packard'</span><span class="p">,</span> <span class="s">'AMZN'</span> <span class="p">:</span> <span class="s">'Amazon'</span><span class="p">,</span> <span class="s">'TM'</span> <span class="p">:</span> <span class="s">'Toyota'</span><span class="p">,</span> <span class="s">'CAJ'</span> <span class="p">:</span> <span class="s">'Canon'</span><span class="p">,</span> <span class="s">'MTU'</span> <span class="p">:</span> <span class="s">'Mitsubishi'</span><span class="p">,</span> <span class="s">'SNE'</span> <span class="p">:</span> <span class="s">'Sony'</span><span class="p">,</span> <span class="s">'F'</span> <span class="p">:</span> <span class="s">'Ford'</span><span class="p">,</span> <span class="s">'HMC'</span> <span class="p">:</span> <span class="s">'Honda'</span><span class="p">,</span> <span class="s">'NAV'</span> <span class="p">:</span> <span class="s">'Navistar'</span><span class="p">,</span> <span class="s">'NOC'</span> <span class="p">:</span> <span class="s">'Northrop Grumman'</span><span class="p">,</span> <span class="s">'BA'</span> <span class="p">:</span> <span class="s">'Boeing'</span><span class="p">,</span> <span class="s">'KO'</span> <span class="p">:</span> <span class="s">'Coca Cola'</span><span class="p">,</span> <span class="s">'MMM'</span> <span class="p">:</span> <span class="s">'3M'</span><span class="p">,</span> <span 
class="s">'MCD'</span> <span class="p">:</span> <span class="s">"McDonald's"</span><span class="p">,</span> <span class="s">'PEP'</span> <span class="p">:</span> <span class="s">'Pepsi'</span><span class="p">,</span> <span class="s">'KFT'</span> <span class="p">:</span> <span class="s">'Kraft Foods'</span><span class="p">,</span> <span class="s">'K'</span> <span class="p">:</span> <span class="s">'Kellogg'</span><span class="p">,</span> <span class="s">'UN'</span> <span class="p">:</span> <span class="s">'Unilever'</span><span class="p">,</span> <span class="s">'MAR'</span> <span class="p">:</span> <span class="s">'Marriott'</span><span class="p">,</span> <span class="s">'PG'</span> <span class="p">:</span> <span class="s">'Procter &amp; Gamble'</span><span class="p">,</span> <span class="s">'CL'</span> <span class="p">:</span> <span class="s">'Colgate-Palmolive'</span><span class="p">,</span> <span class="s">'NWS'</span> <span class="p">:</span> <span class="s">'News Corporation'</span><span class="p">,</span> <span class="s">'GE'</span> <span class="p">:</span> <span class="s">'General Electric'</span><span class="p">,</span> <span class="s">'WFC'</span> <span class="p">:</span> <span class="s">'Wells Fargo'</span><span class="p">,</span> <span class="s">'JPM'</span> <span class="p">:</span> <span class="s">'JPMorgan Chase'</span><span class="p">,</span> <span class="s">'AIG'</span> <span class="p">:</span> <span class="s">'AIG'</span><span class="p">,</span> <span class="s">'AXP'</span> <span class="p">:</span> <span class="s">'American Express'</span><span class="p">,</span> <span class="s">'BAC'</span> <span class="p">:</span> <span class="s">'Bank of America'</span><span class="p">,</span> <span class="s">'GS'</span> <span class="p">:</span> <span class="s">'Goldman Sachs'</span><span class="p">,</span> <span class="s">'AAPL'</span> <span class="p">:</span> <span class="s">'Apple'</span><span class="p">,</span> <span class="s">'SAP'</span> <span class="p">:</span> 
<span class="s">'SAP'</span><span class="p">,</span> <span class="s">'CSCO'</span> <span class="p">:</span> <span class="s">'Cisco'</span><span class="p">,</span> <span class="s">'TXN'</span> <span class="p">:</span> <span class="s">'Texas Instruments'</span><span class="p">,</span> <span class="s">'XRX'</span> <span class="p">:</span> <span class="s">'Xerox'</span><span class="p">,</span> <span class="s">'LMT'</span> <span class="p">:</span> <span class="s">'Lockheed Martin'</span><span class="p">,</span> <span class="s">'WMT'</span> <span class="p">:</span> <span class="s">'Wal-Mart'</span><span class="p">,</span> <span class="s">'WAG'</span> <span class="p">:</span> <span class="s">'Walgreen'</span><span class="p">,</span> <span class="s">'HD'</span> <span class="p">:</span> <span class="s">'Home Depot'</span><span class="p">,</span> <span class="s">'GSK'</span> <span class="p">:</span> <span class="s">'GlaxoSmithKline'</span><span class="p">,</span> <span class="s">'PFE'</span> <span class="p">:</span> <span class="s">'Pfizer'</span><span class="p">,</span> <span class="s">'SNY'</span> <span class="p">:</span> <span class="s">'Sanofi-Aventis'</span><span class="p">,</span> <span class="s">'NVS'</span> <span class="p">:</span> <span class="s">'Novartis'</span><span class="p">,</span> <span class="s">'KMB'</span> <span class="p">:</span> <span class="s">'Kimberly-Clark'</span><span class="p">,</span> <span class="s">'R'</span> <span class="p">:</span> <span class="s">'Ryder'</span><span class="p">,</span> <span class="s">'GD'</span> <span class="p">:</span> <span class="s">'General Dynamics'</span><span class="p">,</span> <span class="s">'RTN'</span> <span class="p">:</span> <span class="s">'Raytheon'</span><span class="p">,</span> <span class="s">'CVS'</span> <span class="p">:</span> <span class="s">'CVS'</span><span class="p">,</span> <span class="s">'CAT'</span> <span class="p">:</span> <span class="s">'Caterpillar'</span><span class="p">,</span> <span 
class="s">'DD'</span> <span class="p">:</span> <span class="s">'DuPont de Nemours'</span><span class="p">,</span> <span class="p">}</span> <span class="n">symbols</span><span class="p">,</span> <span class="n">names</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">(</span><span class="n">symbol_dict</span><span class="o">.</span><span class="n">items</span><span class="p">())</span><span class="o">.</span><span class="n">T</span> <span class="n">quotes</span> <span class="o">=</span> <span class="p">[</span><span class="n">finance</span><span class="o">.</span><span class="n">quotes_historical_yahoo</span><span class="p">(</span><span class="n">symbol</span><span class="p">,</span> <span class="n">d1</span><span class="p">,</span> <span class="n">d2</span><span class="p">,</span> <span class="n">asobject</span><span class="o">=</span><span class="bp">True</span><span class="p">)</span> <span class="k">for</span> <span class="n">symbol</span> <span class="ow">in</span> <span class="n">symbols</span><span class="p">]</span> <span class="c">#volumes = np.array([q.volume for q in quotes]).astype(np.float)</span> <span class="nb">open</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="n">q</span><span class="o">.</span><span class="n">open</span> <span class="k">for</span> <span class="n">q</span> <span class="ow">in</span> <span class="n">quotes</span><span class="p">])</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">float</span><span class="p">)</span> <span class="n">close</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">array</span><span class="p">([</span><span class="n">q</span><span class="o">.</span><span class="n">close</span> <span 
class="k">for</span> <span class="n">q</span> <span class="ow">in</span> <span class="n">quotes</span><span class="p">])</span><span class="o">.</span><span class="n">astype</span><span class="p">(</span><span class="n">np</span><span class="o">.</span><span class="n">float</span><span class="p">)</span> <span class="n">variation</span> <span class="o">=</span> <span class="n">close</span> <span class="o">-</span> <span class="nb">open</span> <span class="n">correlations</span> <span class="o">=</span> <span class="n">np</span><span class="o">.</span><span class="n">corrcoef</span><span class="p">(</span><span class="n">variation</span><span class="p">)</span> <span class="n">_</span><span class="p">,</span> <span class="n">labels</span> <span class="o">=</span> <span class="n">cluster</span><span class="o">.</span><span class="n">affinity_propagation</span><span class="p">(</span><span class="n">correlations</span><span class="p">)</span> <span class="k">for</span> <span class="n">i</span> <span class="ow">in</span> <span class="nb">range</span><span class="p">(</span><span class="n">labels</span><span class="o">.</span><span class="n">max</span><span class="p">()</span><span class="o">+</span><span class="mi">1</span><span class="p">):</span> <span class="k">print</span> <span class="s">'Cluster </span><span class="si">%i</span><span class="s">: </span><span class="si">%s</span><span class="s">'</span> <span class="o">%</span> <span class="p">((</span><span class="n">i</span><span class="o">+</span><span class="mi">1</span><span class="p">),</span> <span class="s">', '</span><span class="o">.</span><span class="n">join</span><span class="p">(</span><span class="n">names</span><span class="p">[</span><span class="n">labels</span><span class="o">==</span><span class="n">i</span><span class="p">]))</span> </pre></div> </div> </div> </div> </div> </div> <div class="clearer"></div> </div> </div> <div class="footer"> <p style="text-align: center">This documentation is 
for scikits.learn version 0.6.0</p> <p>© 2010, scikits.learn developers (BSD License). Created using <a href="http://sphinx.pocoo.org/">Sphinx</a> 1.0.5. Design by <a href="http://webylimonada.com">Web y Limonada</a>.</p> </div> </body> </html>