Sophie

ghc-csv-devel-0.1.2-11.fc16.i686.rpm

<?xml version="1.0" encoding="UTF-8"?>
<!DOCTYPE html PUBLIC "-//W3C//DTD XHTML 1.0 Strict//EN" "http://www.w3.org/TR/xhtml1/DTD/xhtml1-strict.dtd">
<html>
<head>
<!-- Generated by HsColour, http://code.haskell.org/~malcolm/hscolour/ -->
<title>Text/CSV.hs</title>
<link type='text/css' rel='stylesheet' href='hscolour.css' />
</head>
<body>
<pre><a name="line-1"></a><span class='hs-comment'>{- |
<a name="line-2"></a>   module: Text.CSV 
<a name="line-3"></a>   license: MIT 
<a name="line-4"></a>   maintainer: Jaap Weel &lt;weel at ugcs dot caltech dot edu&gt; 
<a name="line-5"></a>   stability: provisional 
<a name="line-6"></a>   portability: ghc 
<a name="line-7"></a>
<a name="line-8"></a>   This module parses and dumps documents that are formatted more or
<a name="line-9"></a>   less according to RFC 4180, \"Common Format and MIME Type for
<a name="line-10"></a>   Comma-Separated Values (CSV) Files\",
<a name="line-11"></a>   &lt;<a href="http://www.rfc-editor.org/rfc/rfc4180.txt">http://www.rfc-editor.org/rfc/rfc4180.txt</a>&gt;.
<a name="line-12"></a>
<a name="line-13"></a>   There are some issues with this RFC. I will describe what these
<a name="line-14"></a>   issues are and how I deal with them.
<a name="line-15"></a>
<a name="line-16"></a>   First, the RFC prescribes CRLF standard network line breaks, but
<a name="line-17"></a>   you are likely to run across CSV files with other line endings, so
<a name="line-18"></a>   we accept any sequence of CRs and LFs as a line break. 
<a name="line-19"></a>
<a name="line-20"></a>   Second, there is an optional header line, but the format for the
<a name="line-21"></a>   header line is exactly like a regular record and you can only
<a name="line-22"></a>   figure out whether it exists from the MIME type, which may not be
<a name="line-23"></a>   available. I ignore the issue of header lines and simply turn them
<a name="line-24"></a>   into regular records.
<a name="line-25"></a>   
<a name="line-26"></a>   Third, there is an inconsistency, in that the formal grammar
<a name="line-27"></a>   specifies that fields can contain only certain US ASCII characters,
<a name="line-28"></a>   but the specification of the MIME type allows for other character
<a name="line-29"></a>   sets. I will allow all characters in fields, except for commas, CRs,
<a name="line-30"></a>   LFs and double quotes in unquoted fields. This should make it possible to parse
<a name="line-31"></a>   CSV files in any encoding, but it allows for characters such as
<a name="line-32"></a>   tabs that the RFC may be interpreted to forbid even in non-US-ASCII
<a name="line-33"></a>   character sets. 
<a name="line-34"></a>
<a name="line-35"></a>   NOTE: Several people have asked me to implement extensions that are
<a name="line-36"></a>   used in non-US versions of Microsoft Excel. This library implements
<a name="line-37"></a>   RFC-compliant CSV, not Microsoft Excel CSV. If you want to write a
<a name="line-38"></a>   library that deals with the CSV-like formats used by non-US versions
<a name="line-39"></a>   of Excel or any other software, you should write a separate library. I
<a name="line-40"></a>   suggest you call it Text.SSV, for "Something Separated Values."
<a name="line-41"></a>-}</span>
<a name="line-42"></a>
<a name="line-43"></a><span class='hs-comment'>{- Copyright (c) Jaap Weel 2007.  Permission is hereby granted, free
<a name="line-44"></a>of charge, to any person obtaining a copy of this software and
<a name="line-45"></a>associated documentation files (the "Software"), to deal in the
<a name="line-46"></a>Software without restriction, including without limitation the rights
<a name="line-47"></a>to use, copy, modify, merge, publish, distribute, sublicense, and/or
<a name="line-48"></a>sell copies of the Software, and to permit persons to whom the
<a name="line-49"></a>Software is furnished to do so, subject to the following conditions:
<a name="line-50"></a>The above copyright notice and this permission notice shall be
<a name="line-51"></a>included in all copies or substantial portions of the Software.  THE
<a name="line-52"></a>SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
<a name="line-53"></a>IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF
<a name="line-54"></a>MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND
<a name="line-55"></a>NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR COPYRIGHT HOLDERS BE
<a name="line-56"></a>LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION
<a name="line-57"></a>OF CONTRACT, TORT OR OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION
<a name="line-58"></a>WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE. -}</span>
<a name="line-59"></a>
<a name="line-60"></a><span class='hs-keyword'>module</span> <span class='hs-conid'>Text</span><span class='hs-varop'>.</span><span class='hs-conid'>CSV</span> <span class='hs-layout'>(</span><span class='hs-conid'>CSV</span>
<a name="line-61"></a>                 <span class='hs-layout'>,</span> <span class='hs-conid'>Record</span>
<a name="line-62"></a>                 <span class='hs-layout'>,</span> <span class='hs-conid'>Field</span>
<a name="line-63"></a>                 <span class='hs-layout'>,</span> <span class='hs-varid'>csv</span>
<a name="line-64"></a>                 <span class='hs-layout'>,</span> <span class='hs-varid'>parseCSV</span>
<a name="line-65"></a>                 <span class='hs-layout'>,</span> <span class='hs-varid'>parseCSVFromFile</span>
<a name="line-66"></a>                 <span class='hs-layout'>,</span> <span class='hs-varid'>parseCSVTest</span>
<a name="line-67"></a>                 <span class='hs-layout'>,</span> <span class='hs-varid'>printCSV</span>
<a name="line-68"></a>                 <span class='hs-layout'>)</span> <span class='hs-keyword'>where</span>
<a name="line-69"></a>
<a name="line-70"></a><span class='hs-keyword'>import</span> <span class='hs-conid'>Text</span><span class='hs-varop'>.</span><span class='hs-conid'>ParserCombinators</span><span class='hs-varop'>.</span><span class='hs-conid'>Parsec</span>
<a name="line-71"></a><span class='hs-keyword'>import</span> <span class='hs-conid'>Data</span><span class='hs-varop'>.</span><span class='hs-conid'>List</span> <span class='hs-layout'>(</span><span class='hs-varid'>intersperse</span><span class='hs-layout'>)</span>
<a name="line-72"></a>
<a name="line-73"></a><a name="CSV"></a><span class='hs-comment'>-- | A CSV file is a series of records. According to the RFC, the</span>
<a name="line-74"></a><a name="CSV"></a><span class='hs-comment'>-- records all have to have the same length. As an extension, I</span>
<a name="line-75"></a><a name="CSV"></a><span class='hs-comment'>-- allow variable length records.</span>
<a name="line-76"></a><a name="CSV"></a><span class='hs-keyword'>type</span> <span class='hs-conid'>CSV</span> <span class='hs-keyglyph'>=</span> <span class='hs-keyglyph'>[</span><span class='hs-conid'>Record</span><span class='hs-keyglyph'>]</span>
<a name="line-77"></a>
<a name="line-78"></a><a name="Record"></a><span class='hs-comment'>-- | A record is a series of fields</span>
<a name="line-79"></a><a name="Record"></a><span class='hs-keyword'>type</span> <span class='hs-conid'>Record</span> <span class='hs-keyglyph'>=</span> <span class='hs-keyglyph'>[</span><span class='hs-conid'>Field</span><span class='hs-keyglyph'>]</span>
<a name="line-80"></a>
<a name="line-81"></a><a name="Field"></a><span class='hs-comment'>-- | A field is a string</span>
<a name="line-82"></a><a name="Field"></a><span class='hs-keyword'>type</span> <span class='hs-conid'>Field</span> <span class='hs-keyglyph'>=</span> <span class='hs-conid'>String</span>
<a name="line-83"></a>
<a name="line-84"></a><a name="csv"></a><span class='hs-comment'>-- | A Parsec parser for parsing CSV files</span>
<a name="line-85"></a><span class='hs-definition'>csv</span> <span class='hs-keyglyph'>::</span> <span class='hs-conid'>Parser</span> <span class='hs-conid'>CSV</span>
<a name="line-86"></a><span class='hs-definition'>csv</span> <span class='hs-keyglyph'>=</span> <span class='hs-keyword'>do</span> <span class='hs-varid'>x</span> <span class='hs-keyglyph'>&lt;-</span> <span class='hs-varid'>record</span> <span class='hs-varop'>`sepEndBy`</span> <span class='hs-varid'>many1</span> <span class='hs-layout'>(</span><span class='hs-varid'>oneOf</span> <span class='hs-str'>"\n\r"</span><span class='hs-layout'>)</span>
<a name="line-87"></a>         <span class='hs-varid'>eof</span>
<a name="line-88"></a>         <span class='hs-varid'>return</span> <span class='hs-varid'>x</span>
<a name="line-89"></a>
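<span class='hs-comment'>-- A record is one or more fields separated by commas. An unquoted field</span>
<span class='hs-comment'>-- may be empty, so this parser also succeeds on empty input, giving [""].</span>
<a name="line-90"></a><a name="record"></a><span class='hs-definition'>record</span> <span class='hs-keyglyph'>::</span> <span class='hs-conid'>Parser</span> <span class='hs-conid'>Record</span>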
<a name="line-91"></a><span class='hs-definition'>record</span> <span class='hs-keyglyph'>=</span> <span class='hs-layout'>(</span><span class='hs-varid'>quotedField</span> <span class='hs-varop'>&lt;|&gt;</span> <span class='hs-varid'>field</span><span class='hs-layout'>)</span> <span class='hs-varop'>`sepBy`</span> <span class='hs-varid'>char</span> <span class='hs-chr'>','</span>
<a name="line-92"></a>
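<span class='hs-comment'>-- An unquoted field is any run of characters other than a comma, LF, CR</span>
<span class='hs-comment'>-- or double quote. The run may be empty.</span>
<a name="line-93"></a><a name="field"></a><span class='hs-definition'>field</span> <span class='hs-keyglyph'>::</span> <span class='hs-conid'>Parser</span> <span class='hs-conid'>Field</span>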
<a name="line-94"></a><span class='hs-definition'>field</span> <span class='hs-keyglyph'>=</span> <span class='hs-varid'>many</span> <span class='hs-layout'>(</span><span class='hs-varid'>noneOf</span> <span class='hs-str'>",\n\r\""</span><span class='hs-layout'>)</span>
<a name="line-95"></a>
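<span class='hs-comment'>-- A quoted field is text between double quotes. Inside it, commas and</span>
<span class='hs-comment'>-- line breaks are allowed, and a doubled quote stands for one literal quote.</span>
<a name="line-96"></a><a name="quotedField"></a><span class='hs-definition'>quotedField</span> <span class='hs-keyglyph'>::</span> <span class='hs-conid'>Parser</span> <span class='hs-conid'>Field</span>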
<a name="line-97"></a><span class='hs-definition'>quotedField</span> <span class='hs-keyglyph'>=</span> <span class='hs-varid'>between</span> <span class='hs-layout'>(</span><span class='hs-varid'>char</span> <span class='hs-chr'>'"'</span><span class='hs-layout'>)</span> <span class='hs-layout'>(</span><span class='hs-varid'>char</span> <span class='hs-chr'>'"'</span><span class='hs-layout'>)</span> <span class='hs-varop'>$</span>
<a name="line-98"></a>              <span class='hs-varid'>many</span> <span class='hs-layout'>(</span><span class='hs-varid'>noneOf</span> <span class='hs-str'>"\""</span> <span class='hs-varop'>&lt;|&gt;</span> <span class='hs-varid'>try</span> <span class='hs-layout'>(</span><span class='hs-varid'>string</span> <span class='hs-str'>"\"\""</span> <span class='hs-varop'>&gt;&gt;</span> <span class='hs-varid'>return</span> <span class='hs-chr'>'"'</span><span class='hs-layout'>)</span><span class='hs-layout'>)</span>
<a name="line-99"></a>
<a name="line-100"></a><a name="parseCSV"></a><span class='hs-comment'>-- | Given a file name (used only for error messages) and a string to</span>
<a name="line-101"></a><span class='hs-comment'>-- parse, run the parser.</span>
<a name="line-102"></a><span class='hs-definition'>parseCSV</span> <span class='hs-keyglyph'>::</span> <span class='hs-conid'>FilePath</span> <span class='hs-keyglyph'>-&gt;</span> <span class='hs-conid'>String</span> <span class='hs-keyglyph'>-&gt;</span> <span class='hs-conid'>Either</span> <span class='hs-conid'>ParseError</span> <span class='hs-conid'>CSV</span>
<a name="line-103"></a><span class='hs-definition'>parseCSV</span> <span class='hs-keyglyph'>=</span> <span class='hs-varid'>parse</span> <span class='hs-varid'>csv</span>
<a name="line-104"></a>
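<span class='hs-comment'>{- Usage sketch, not part of the original module. The names
   exampleLineBreaks and dropTrailingEmpty are hypothetical and only
   illustrate how parseCSV, as defined above, treats line breaks. -}</span>

<span class='hs-comment'>-- Any run of CRs and LFs counts as one separator, so the mixed line</span>
<span class='hs-comment'>-- endings and the blank line below should give three plain records.</span>
exampleLineBreaks :: Either ParseError CSV
exampleLineBreaks = parseCSV "example.csv" "a,b\r\nc,d\n\ne,f"

<span class='hs-comment'>-- Input that ends in a line break yields a final record [""], because</span>
<span class='hs-comment'>-- the record parser also matches empty input. A caller who does not</span>
<span class='hs-comment'>-- want that record could strip it along these lines.</span>
dropTrailingEmpty :: CSV -&gt; CSV
dropTrailingEmpty = reverse . dropWhile (== [""]) . reverse
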
<a name="line-105"></a><a name="parseCSVFromFile"></a><span class='hs-comment'>-- | Given a file name, read from that file and run the parser</span>
<a name="line-106"></a><span class='hs-definition'>parseCSVFromFile</span> <span class='hs-keyglyph'>::</span> <span class='hs-conid'>FilePath</span> <span class='hs-keyglyph'>-&gt;</span> <span class='hs-conid'>IO</span> <span class='hs-layout'>(</span><span class='hs-conid'>Either</span> <span class='hs-conid'>ParseError</span> <span class='hs-conid'>CSV</span><span class='hs-layout'>)</span>
<a name="line-107"></a><span class='hs-definition'>parseCSVFromFile</span> <span class='hs-keyglyph'>=</span> <span class='hs-varid'>parseFromFile</span> <span class='hs-varid'>csv</span>
<a name="line-108"></a>
<a name="line-109"></a><a name="parseCSVTest"></a><span class='hs-comment'>-- | Given a string, run the parser, and print the result on stdout.</span>
<a name="line-110"></a><span class='hs-definition'>parseCSVTest</span> <span class='hs-keyglyph'>::</span> <span class='hs-conid'>String</span> <span class='hs-keyglyph'>-&gt;</span> <span class='hs-conid'>IO</span> <span class='hs-conid'>()</span>
<a name="line-111"></a><span class='hs-definition'>parseCSVTest</span> <span class='hs-keyglyph'>=</span> <span class='hs-varid'>parseTest</span> <span class='hs-varid'>csv</span>
<a name="line-112"></a>
<a name="line-113"></a><a name="printCSV"></a><span class='hs-comment'>-- | Given an object of type CSV, generate a CSV formatted</span>
<a name="line-114"></a><span class='hs-comment'>-- string. Every field is quoted, and records are joined by LF.</span>
<a name="line-115"></a><span class='hs-definition'>printCSV</span> <span class='hs-keyglyph'>::</span> <span class='hs-conid'>CSV</span> <span class='hs-keyglyph'>-&gt;</span> <span class='hs-conid'>String</span>
<a name="line-116"></a><span class='hs-definition'>printCSV</span> <span class='hs-varid'>records</span> <span class='hs-keyglyph'>=</span> <span class='hs-varid'>unlines</span> <span class='hs-layout'>(</span><span class='hs-varid'>printRecord</span> <span class='hs-varop'>`map`</span> <span class='hs-varid'>records</span><span class='hs-layout'>)</span>
<a name="line-117"></a>    <span class='hs-keyword'>where</span> <span class='hs-varid'>printRecord</span> <span class='hs-keyglyph'>=</span> <span class='hs-varid'>concat</span> <span class='hs-varop'>.</span> <span class='hs-varid'>intersperse</span> <span class='hs-str'>","</span> <span class='hs-varop'>.</span> <span class='hs-varid'>map</span> <span class='hs-varid'>printField</span>
<a name="line-118"></a>          <span class='hs-varid'>printField</span> <span class='hs-varid'>f</span> <span class='hs-keyglyph'>=</span> <span class='hs-str'>"\""</span> <span class='hs-varop'>++</span> <span class='hs-varid'>concatMap</span> <span class='hs-varid'>escape</span> <span class='hs-varid'>f</span> <span class='hs-varop'>++</span> <span class='hs-str'>"\""</span>
<a name="line-119"></a>          <span class='hs-varid'>escape</span> <span class='hs-chr'>'"'</span> <span class='hs-keyglyph'>=</span> <span class='hs-str'>"\"\""</span>
<a name="line-120"></a>          <span class='hs-varid'>escape</span> <span class='hs-varid'>x</span> <span class='hs-keyglyph'>=</span> <span class='hs-keyglyph'>[</span><span class='hs-varid'>x</span><span class='hs-keyglyph'>]</span>
<a name="line-121"></a>          <span class='hs-varid'>unlines</span> <span class='hs-keyglyph'>=</span> <span class='hs-varid'>concat</span> <span class='hs-varop'>.</span> <span class='hs-varid'>intersperse</span> <span class='hs-str'>"\n"</span>
<a name="line-122"></a>
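<span class='hs-comment'>{- Usage sketch, not part of the original module. The names
   exampleInput, exampleParsed and exampleRoundTrips are hypothetical
   and only illustrate how parseCSV and printCSV fit together. -}</span>

<span class='hs-comment'>-- Two records. The second ends in a quoted field that contains a comma</span>
<span class='hs-comment'>-- and doubled (escaped) double quotes.</span>
exampleInput :: String
exampleInput = "name,comment\r\nweel,\"said \"\"hi\"\", then left\""

<span class='hs-comment'>-- The file name "example.csv" is used only in error messages.</span>
exampleParsed :: Either ParseError CSV
exampleParsed = parseCSV "example.csv" exampleInput

<span class='hs-comment'>-- printCSV quotes every field, so its output differs textually from</span>
<span class='hs-comment'>-- exampleInput, but parsing that output should give back the same records.</span>
exampleRoundTrips :: Bool
exampleRoundTrips = either (const False) roundTrips exampleParsed
    where roundTrips records =
              either (const False) (== records)
                     (parseCSV "printed" (printCSV records))
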
</pre></body>
</html>