Sometimes when you are developing a project, it might be necessary to do a parsing of xls documents. To give an example: you do a synchronization between xls worksheets and a website database, and you need to convert xls data to the Mysql and want to do it completely automatically.
If you work with Windows it is simple enough – you just need to use COM objects. However, it is another thing if you work with PHP and need to make it work under the UNIX systems. Fortunately there are many classes and libraries for this purpose. One of them is the class PHPExcel. This library is completely cross-platform, so you will not have problems with portability.
PHPxcel allows importing and exporting data with Excel and applying formatting styles to the reports. Scraping tasks often require reading, so this is what we are going to learn here. With PHPExcel we can read the following formats:
- Excel 2007
- Excel 5.0/Excel 95
- Excel 97 nd latest
- PHPExcel Serialized Spreadshet
- Symbolic Link
- CSV
The PHPxcel library is available for downloading here.
The library contains the following classes and files:
/classes // The main catalog of library
/clasess/PHPExcel/ // Catalog of classes of PHPExcel
/classess/PHPExcel.php // The file for including
/documentation // The full documentation
/license.txt // The license
/install.txt // Brief instruction
/examples // Catalog with usage and examples
/changelog.txt // List of changes with the current release
Additionally, it would be a good thing if functionality included storing and getting records from the database. For this purpose we’ll use the PDO class – a convenient instrument to work with MySQL database. Firstly, let’s create a table called “excel_data”:
CREATE TABLE excel_data ( id INT UNSIGNED AUTO_INCREMENT PRIMARY KEY, value VARCHAR(32) )
Reading from Excel file
For the data reading we need the objects of three classes:
PHPExcel_Worksheet_RowIterator – we will use it for row iteration
PHPExcel_Worksheet_CellIterator – we will use it for cell iteration.
PDO – to create a connection to our DB and use it to get data from it.
Below is the example of how we read the data from a worksheet and the database. Let’s say we want to display html table scraping data from excel sheet.
// including library to the code require_once ('PHPExcel-1.8/Classes/PHPExcel/IOFactory.php') // Opening the file $xls = PHPExcel_IOFactory::load('myfile.xls'); // Set up $db, $name, $password variables $db = new PDO('mysql:dbname='.$db.';host = 127.0.0.1', $name, $password); // Setting an active sheet $xls->setActiveSheetIndex(0); // Getting the active sheet into the object of class Spreadsheet_Excel_Writer_Worksheet representing the worksheet $sheet = $xls->getActiveSheet(); // getHighestRow - the method returning the highest row in a worksheet; for ($i = 2; $i <= $sheet->getHighestRow(); $i++) { $nColumn = PHPExcel_Cell::columnIndexFromString($sheet->getHighestColumn()); for ($j = 0; $j < $nColumn; $j++) { $value = $sheet->getCellByColumnAndRow($j, $i)->getValue(); // we add data to the DB using sql query. $query = $db->prepare("INSERT INTO excel_test(value) VALUES(:value)"); $query->bindParam(':value', $value); $query->execute(); } }
The method getHighestColumn returns the name of the highest column in a string presentation (A1, B2, etc). The static method PHPExcel_Cell::columnIndexFromString returns the number of this column.
Now the data are displayed. Of course, we can use it for any purpose: for example, for the updating of prices and stocks in our website product catalog.
Writing into Excel file
Besides reading, we can use PHPExcel for writing – importing data to Excel. Below is the example:
// Including PHPExcel require_once('PHPExcel-1.8/Classes/PHPExcel.php'); // Including class for the output data in the Excel format require_once('PHPExcel-1.8/Classes/PHPExcel/Writer/Excel5.php'); // Creating the object of PHPExcel $xls = new PHPExcel(); // Creating a new connection. Set the $db, $name, $password variables to your connection data $db = new PDO('mysql:dbname='.$db.';host = 127.0.0.1', $name, $password); // Setting the active worksheet index $xls->setActiveSheetIndex(0); // Getting the active worksheet $sheet = $xls->getActiveSheet(); // Setting the title $sheet->setTitle('My sheet'); // As an example, we will get 10 records from db $matrix = $db->prepare("SELECT value FROM excel_test LIMIT 10"); $matrix->execute(); $matrix = $matrix->fetchAll(PDO::FETCH_ASSOC); // Method setCellValueByColumnAndRow() of worksheet class allows us to set the value of cell referencing to it by the row and column number. $i = 1; $j = 1; foreach($matrix as $mat) { $sheet->setCellValueByColumnAndRow($i, $j, $mat['value']); $j++; } //Preparing headers header ( "Expires: Mon, 1 Apr 1974 05:00:00 GMT" ); header ( "Last-Modified: " . gmdate("D,d M YH:i:s") . " GMT" ); header ( "Cache-Control: no-cache, must-revalidate" ); header ( "Pragma: no-cache" ); header ( "Content-type: application/vnd.ms-excel" ); header ( "Content-Disposition: attachment; filename=matrix.xls" ); //Receiving a new excel file! $objWriter = new PHPExcel_Writer_Excel5($xls); $objWriter->save('php://output');
Instead of the method setCellValueByColumnAndRow(), we could use the method setCellValue(), but in this case we would have to reference the cell by row and column name (A1, B2, etc).
We have reviewed the main methods of the PHPExcel class, which will help us to get data from Excel and manage the workbook content. There are a lot of other methods which you can find in documentation.