-
-
Notifications
You must be signed in to change notification settings - Fork 637
Enable est column width calculation #866
base: master
Are you sure you want to change the base?
Conversation
@adrilo I noticed two tests are failing and it seems the test has an error... The two particular resource files involved are generated from a unix environment instead of the current environment.. I think those two files should be dynamically generated to ensure they are compatible with the particular environment.. Meanwhile I've attached further examples of using fixed width style calculation below..
|
* @param $pos Position to insert | ||
* @param $content Content to insert | ||
*/ | ||
public static function insertToFile($fp, $pos, $content) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This can be very costly to rewrite the contents, especially with large spreadsheet
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
True, I will try both approaches and do a benchmark today. It didn't seem to matter for small files.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adrilo So I ran some tests and this is not actually a bad solution at all... (performance differences range from 3% to 7% and seems to get better for larger amount of rows).. (Only tested 100,000, 500,000 and 1 million rows though)
The difference are quite negligible most likely because the file is still in memory the whole time and the only costly operation here is really calling stream_get_contents
.
|
||
/** | ||
* Instead of seeking and re-writing from position, a better hack might be to write dummy empty data | ||
* Enough to take care of any length, then carefully overwrite |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yep, that's the strategy I've used elsewhere
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@adrilo Benchmark result with overwriting empty spaces... (negligible again but may be useful)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tested 5,000 rows, 500,000 rows and 50,000 rows... The script is included below..
<?php
use Box\Spout\Common\Type;
use Box\Spout\Writer\Common\Creator\Style\StyleBuilder;
use Box\Spout\Writer\Common\Creator\WriterEntityFactory;
require __DIR__ . '/vendor/autoload.php';
ini_set('memory_limit', -1);
function generateRows()
{
$headers = [];
for($i=0;$i<10;$i++) {
$n = $i+1;
$headers[] = "COL-$n";
}
yield $headers;
for($i=0; $i< 500000; $i++) {
$row = [];
$rand = mt_rand(1, 30);
foreach ($headers as $header) {
$randLen = mt_rand(1, $rand);
$row[] = generateRandomString($randLen);
}
yield $row;
}
}
function generateRandomString($length = 10) {
$characters = 'aaaaeeeeiiiiiiiioooouuuubcdefghijklmnopqrstuvwxyz';
$charactersLength = strlen($characters);
$randomString = '';
for ($i = 0; $i < $length; $i++) {
$randomString .= $characters[rand(0, $charactersLength - 1)];
}
return $randomString;
}
$data = generateRows();
$defaultStyle = (new StyleBuilder())
->setFontSize(36)
->build();
$boldtyle = (new StyleBuilder())
->setFontSize(42)
->setFontBold()
->build();
//$writer = WriterEntityFactory::createWriter(Type::ODS);
$start = hrtime(true);
$writer = WriterEntityFactory::createWriter(Type::XLSX);
$writer->setWidthCalculation(1)->setDefaultRowStyle($defaultStyle)->openToFile('spout_trunk.xlsx');
//$writer->openToFile('spouter.xlsx');
foreach ($data as $i => $row) {
if ($i == 0) {
$writer->addRow(WriterEntityFactory::createRowFromArray($row, $boldtyle));
} else {
$writer->addRow(WriterEntityFactory::createRowFromArray($row));
}
}
$writer->close();
$end = hrtime(true);
$elapsed = ($end - $start)/1000000.0;
echo "$elapsed ms counted with truncate".PHP_EOL;
$data = generateRows();
$start = hrtime(true);
$writer = WriterEntityFactory::createWriter(Type::XLSX);
//width calculation style 3 just for testing purposes
$writer->setWidthCalculation(3)->setDefaultRowStyle($defaultStyle)->openToFile('spout_over.xlsx');
//$writer->openToFile('spouter.xlsx');
foreach ($data as $i => $row) {
if ($i == 0) {
$writer->addRow(WriterEntityFactory::createRowFromArray($row, $boldtyle));
} else {
$writer->addRow(WriterEntityFactory::createRowFromArray($row));
}
}
$writer->close();
$end = hrtime(true);
$elapsed = ($end - $start)/1000000.0;
echo "$elapsed ms counted with overwrite";
public function autoSetWidth($cell, $style, $zeroBasedIndex) | ||
{ | ||
$size = 1 + strlen($cell->getValue());//ensure we have at least 1 space | ||
$size *= $style->isFontBold() ? 1.2 : 1.0; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How did you come up with the 1.2 ratio?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's actually meant to be 700/400.. but the 1.2 seemed to fit better.. and that may be because tinier letters like 'i', like you rightly pointed out; occupy less space..
Hi, instead of overwriting dummy portion of the end file, just like ODS already did it with openspout/openspout#45 now also XLSX writes the rows in a temporary file and only after that it writes all the remaining data, thanks to the speed of So now column with, alongside auto-sizing, could be easily supported and already achievable by user code without affecting performances 🚀 |
This is a first step to resolving #129
This approach primarily computes the width size by using an estimate created
After the PR, the library may be used as follows (sample outputs are attached below):
spouter.xlsx
spouter.ods