Javaã§Excelã®èªã¿æ¸ããããæã«ã¯ãApache POIã使ããã¨ãå¤ãã¨æãã¾ããã¾ããPOIã®ä¸ã«ãã³ãã¬ã¼ããçµãã§â¦ã¿ãããªãã®ãããããã§ããã
Apache POI - the Java API for Microsoft Documents
http://poi.apache.org/
POIèªä½ã¯ãExcel以å¤ãæ±ãã¾ããã
ã§ãæè¿ã¡ãã£ã¨ã大ããªExcelãããã°ã©ã ã§åºåããããã ãã©ãã¿ãããªãã¨ãè¨ãããã¡ãã£ã¨POIã®äºæ ã調ã¹ã¦ã¿ã¾ããã
POIã¯Excelåºåå¯è½ã§ãããããªãã¡ã¢ãªãå¤ãåãã®ã§å¤§ããªExcelãåºåãããã¨ããã¨ã¡ãã£ã¨ã¤ããã§ãã
ã¡ãªã¿ã«ãããã§èªåãè¨ãã大ããªExcelãã¨ããã®ã¯ãæ°åä¸è¡ã®ãªã¼ãã¼ã§ãããããããªã®ãExcelã§åºåãã¦ã©ããããã ããã¨ã¯æãã¾ããã
調ã¹ã¦ã¿ãã¨ãããPOI 3.8 beta3以éã§ããã°ãSXSSF APIã¨ãããã®ã使ç¨ãããã¨ã§ä½ã¡ã¢ãªã§Excelçæãå¯è½ã«ãªã£ã¦ããããã§ãã
POI-HSSF and POI-XSSF - Java API To Access Microsoft Excel Format Files
http://poi.apache.org/spreadsheet/index.html
SXSSF (Streaming Usermodel API)
http://poi.apache.org/spreadsheet/how-to.html#sxssf
ãã¡ããããã¬ã¼ããªãã¯ããããã§ãã
POI-HSSF and POI-XSSF - Java API To Access Microsoft Excel Format Filesãã®ãã¼ã¸ã®è¡¨ã«ãããã¾ããã
- Sheetï¼Rowï¼Cellã®åé¤ã¯ã§ããªã
- Rowã®ã·ããã¯ã§ããªã
- Sheetã®ã¯ãã¼ã³ä¸å¯
- æ°å¼ã®ä½¿ç¨ã¯ä¸å¯
- Cellã¸ã®ã³ã¡ã³ãã®ä½¿ç¨ã¯ä¸å¯
ã¨ããã®ãããã¾ãããã¨ãä¸æãã¡ã¤ã«ãä½æããã¾ãã
ããã¾ã話ã¨ãã¦ã¯Excel 2007以éã®ãã©ã¼ãããï¼xlsxï¼ã§ããããã
使ã£ã¦ã¿ã
ã§ã¯ãæ©é使ã£ã¦ã¿ã¾ããããMavenä¾åé¢ä¿ã«ã¯ã以ä¸ãå ãã¾ãã
<dependency> <groupId>org.apache.poi</groupId> <artifactId>poi-ooxml</artifactId> <version>3.10.1</version> </dependency>
ç¾å¨ã®ææ°å®å®çã¯ã3.10.1ã§ãã
ä¾åé¢ä¿ã¨ãã¦ã¯artifactIdã¨ãã¦ãpoiãã¨ãpoi-ooxml-schemasããå¿
è¦ã§ãããä¸è¨ãå ããã¨ããããèªåçã«è¿½å ããã¾ãã
ãã£ãããªã®ã§ãXSSF APIã¨SXSSF APIãåãæ¿ãããµã³ãã«ã§æ¸ãã¦ã¿ã¾ãã
src/main/java/ExcelGen.java
import java.io.FileOutputStream; import java.io.IOException; import java.util.stream.IntStream; import org.apache.poi.ss.usermodel.Row; import org.apache.poi.ss.usermodel.Sheet; import org.apache.poi.ss.usermodel.Workbook; import org.apache.poi.xssf.usermodel.XSSFWorkbook; import org.apache.poi.xssf.streaming.SXSSFWorkbook; public class ExcelGen { public static void main(String[] args) { String type = args[0]; int rows = Integer.parseInt(args[1]); String fileName = args[2]; Workbook workbook; switch (type.toUpperCase()) { case "XSSF": workbook = new XSSFWorkbook(); break; case "SXSSF": workbook = new SXSSFWorkbook(); break; default: workbook = new XSSFWorkbook(); break; } Sheet sheet = workbook.createSheet(); IntStream .range(0, rows) .forEach(i -> { Row row = sheet.createRow(i); row.createCell(0).setCellValue("ãã¹ã-" + (i + 1)); row.createCell(1).setCellValue("foo-" + (i + 1)); row.createCell(2).setCellValue("bar-" + (i + 1)); }); try (FileOutputStream fos = new FileOutputStream(fileName)) { workbook.write(fos); } catch (IOException e) { e.printStackTrace(); } } }
èµ·åå¼æ°ã§ãXSSF APIãSXSSF APIããé¸æããããã«ãã¦ããã以éã®ã³ã¼ãã¯ãã¹ã¦åãã§ãã
case "XSSF": workbook = new XSSFWorkbook(); break; case "SXSSF": workbook = new SXSSFWorkbook(); break;
第2å¼æ°ã¯åºåããExcelã®è¡æ°ã第3å¼æ°ã¯åºåãããã¡ã¤ã«åã§ãã
å®è¡ã¯ãé¢åãªã®ã§ãjava -jarãã§å®è¡ã§ããããã«ãmaven-shade-pluginã使ç¨ãã¾ããã
<plugin> <groupId>org.apache.maven.plugins</groupId> <artifactId>maven-shade-plugin</artifactId> <version>2.3</version> <configuration> <transformers> <transformer implementation="org.apache.maven.plugins.shade.resource.ManifestResourceTransformer"> <mainClass>ExcelGen</mainClass> </transformer> </transformers> </configuration> <executions> <execution> <phase>package</phase> <goals> <goal>shade</goal> </goals> </execution> </executions> </plugin>
ããã±ã¼ã¸ã³ã°ãã¦
$ mvn package
ãããªæãã§å®è¡ã
$ java -jar target/large-exel-0.0.1-SNAPSHOT.jar [ãã©ã¼ããã] [è¡æ°] [åºåãã¡ã¤ã«å]
ã¾ããXSSF APIã使ã£ã¦ã¿ã¾ããã¡ã¢ãªã¯256Mã¾ã§ã«ãã¦ã10ä¸è¡ã®Excelãä½ã£ã¦ã¿ã¾ãããã
ï¼timeãã¤ãã¦ã¿ã¾ãã
$ time java -Xmx256M -jar target/large-exel-0.0.1-SNAPSHOT.jar XSSF 100000 xssf.xlsx
çµæãããããæéãããã£ãæå¥ãOutOfMemoryErrorã§è½ã¡ã¾ãã
$ time java -Xmx256M -jar target/large-exel-0.0.1-SNAPSHOT.jar XSSF 100000 xssf.xlsx Exception in thread "main" java.lang.OutOfMemoryError: GC overhead limit exceeded at org.apache.xmlbeans.impl.store.Saver$TextSaver.resize(Saver.java:1700) at org.apache.xmlbeans.impl.store.Saver$TextSaver.preEmit(Saver.java:1303) at org.apache.xmlbeans.impl.store.Saver$TextSaver.emit(Saver.java:1190) at org.apache.xmlbeans.impl.store.Saver$TextSaver.emitElement(Saver.java:962) at org.apache.xmlbeans.impl.store.Saver.processElement(Saver.java:476) at org.apache.xmlbeans.impl.store.Saver.process(Saver.java:307) at org.apache.xmlbeans.impl.store.Saver$TextSaver.saveToString(Saver.java:1864) at org.apache.xmlbeans.impl.store.Cursor._xmlText(Cursor.java:546) at org.apache.xmlbeans.impl.store.Cursor.xmlText(Cursor.java:2436) at org.apache.xmlbeans.impl.values.XmlObjectBase.xmlText(XmlObjectBase.java:1500) at org.apache.poi.xssf.model.SharedStringsTable.getKey(SharedStringsTable.java:134) at org.apache.poi.xssf.model.SharedStringsTable.addEntry(SharedStringsTable.java:180) at org.apache.poi.xssf.usermodel.XSSFCell.setCellValue(XSSFCell.java:350) at org.apache.poi.xssf.usermodel.XSSFCell.setCellValue(XSSFCell.java:320) at ExcelGen.lambda$main$0(ExcelGen.java:37) at ExcelGen$$Lambda$1/868737467.accept(Unknown Source) at java.util.stream.Streams$RangeIntSpliterator.forEachRemaining(Streams.java:110) at java.util.stream.IntPipeline$Head.forEach(IntPipeline.java:557) at ExcelGen.main(ExcelGen.java:34) real 1m48.413s user 2m48.404s sys 0m3.948s
ã§ã¯ãç¶ãã¦SXSSF APIã§å®è¡ã
$ time java -Xmx256M -jar target/large-exel-0.0.1-SNAPSHOT.jar SXSSF 100000 sxssf.xlsx real 0m4.883s user 0m5.312s sys 0m1.276s
XSSF APIã«æ¯ã¹ã¦ããã¡ããã¡ãéãã§ããï¼
ã¡ããã¨Excelãã¡ã¤ã«ãã§ãã¦ãã¾ãã
$ ls -lh sxssf.xlsx -rw-rw-r-- 1 xxxxx xxxxx 1.9M 10æ 11 18:12 sxssf.xlsx
éãã¦ç¢ºèªãããã¨ãã§ãã¾ããå½ç¶ãéãã§ããâ¦ã
ã¨ããã§ããã®æLinuxãªãã/tmpãé
ä¸ï¼ç¨ã¯Javaã®ä¸æãã£ã¬ã¯ããªé
ä¸ï¼ã«å·¨å¤§ãªãã¡ã¤ã«ãæ®ãã¾ãã
â»æ¶ãæ¹ã¯ãå¾è¿°
$ ls -lh /tmp/poi-sxssf-sheet7480297749844044258.xml -rw-rw-r-- 1 xxxxx xxxxx 21M 10æ 11 18:12 /tmp/poi-sxssf-sheet7480297749844044258.xml
å®ã«ãåºæ¥ä¸ãã£ãExcelãã¡ã¤ã«ã®10å以ä¸ã®ãµã¤ãºã®ãã¡ã¤ã«ã§ãã
ã¡ãªã¿ã«ãããã·ã¼ãã®ä¸éãã¡ã¤ã«ã§ãã
ã©ããªã£ã¦ããã ããã¨æã£ã¦ãã¡ãã£ã¨ã½ã¼ã¹ãè¦ãã¦ã¿ã¾ããã
ã¾ããSXSSF APIã£ã¦ãXSSF APIã®ä¸ã«æãç«ã£ã¦ããã§ããã
https://github.com/apache/poi/blob/REL_3_10_1/src/ooxml/java/org/apache/poi/xssf/streaming/SXSSFWorkbook.java#L63
ãããããSXSSFWorkbookã¯ã©ã¹ã®ã¤ã³ã¹ã¿ã³ã¹ãä½æããéã«ãã³ã³ã¹ãã©ã¯ã¿å¼æ°ã«XSSFWorkbookã®ã¤ã³ã¹ã¿ã³ã¹ã渡ããã¨ãã§ãã¾ããã
ã§ãSheetã§ããSXSSSheetã¯ãSheetDataWriterã¨ããWriterãæã£ã¦ãã¾ãã
https://github.com/apache/poi/blob/REL_3_10_1/src/ooxml/java/org/apache/poi/xssf/streaming/SXSSFSheet.java#L59
ãã®SheetDataWriterãããã®ä¸éãã¡ã¤ã«ãæ¸ãåºãã¦ããããã§ãã
https://github.com/apache/poi/blob/REL_3_10_1/src/ooxml/java/org/apache/poi/xssf/streaming/SheetDataWriter.java#L65
ãã©ãã·ã¥ã®å¶å¾¡ããä¸éãã¡ã¤ã«ã®å§ç¸®è¨å®ã¯SXSSFWorkbookã§è¨å®ã§ããã¿ããã§ããã
ã§ãSXSSFWorkbook#write(OutputStream)ãå¼ã³åºããéã«ãå
ã»ã©ã®ã·ã¼ãã®ä¸éãã¡ã¤ã«ããå
¨ä½ã®ãã¡ã¤ã«ãä¸æãã¡ã¤ã«ã¨ãã¦ä½æããããããwriteã¡ã½ããã®å¼æ°ã«ä¸ããOutputStreamã«æçµå½¢å¼ã®ãã¼ã¿ãæ¸ãè¾¼ãã§ããã¨ããå½¢ã«ãªã£ã¦ããããã§ãã
https://github.com/apache/poi/blob/REL_3_10_1/src/ooxml/java/org/apache/poi/xssf/streaming/SheetDataWriter.java#L65
å ¨ä½ã®ä¸æãã¡ã¤ã«ã¯ãåé¤ããã¾ãã
ã·ã¼ãã®ä¸éãã¡ã¤ã«ããã¤æ¶ããã®ãã¨ããã¨ãfinalizeä¾åã ã£ããâ¦ã
https://github.com/apache/poi/blob/REL_3_10_1/src/ooxml/java/org/apache/poi/xssf/streaming/SheetDataWriter.java#L116
追è¨ï¼
ã·ã¼ãã®ä¸éãã¡ã¤ã«ã¯ãWorkbookãcloseï¼POIã®ãã¼ã¸ã§ã³ã«ãã£ã¦ã¯disposeï¼ããã°åé¤ããã¾ãã
ã¾ããæ©è½çã«å¶éã¯ããã¨ããã¨ãã®æ³¨æç¹ã«æ°ãä»ãã¦ããã°ã使ãããã§ãããã
ããããããããªã§ããExcelãä½ããªããã°ããã®ã«â¦ã