This repository was archived by the owner on Jan 7, 2024. It is now read-only.

Commit 2639edd

Merge branch 'master' into column-position-lattice
2 parents 5cdb22f + 4aa9568 commit 2639edd

24 files changed

Lines changed: 287 additions & 302 deletions

.travis.yml

Lines changed: 4 additions & 6 deletions
@@ -2,10 +2,8 @@ language: java
 install: mvn install -DskipTests=true -Dmaven.javadoc.skip=true -Dgpg.skip=true -B -V
 script: mvn test -Dgpg.skip=true
 jdk:
-  - openjdk7
-  - oraclejdk8
-  - oraclejdk9
+  - openjdk8
+  - openjdk9
+  - openjdk10
+  - openjdk11
 sudo: false
-
-
-

README.md

Lines changed: 17 additions & 15 deletions
@@ -1,4 +1,4 @@
-tabula-java [![Build Status](https://travis-ci.org/tabulapdf/tabula-java.svg?branch=master)](https://travis-ci.org/tabulapdf/tabula-java) [![Build status](https://ci.appveyor.com/api/projects/status/l5gym1mjhrd2v8yn?svg=true)](https://ci.appveyor.com/project/jazzido/tabula-java) [![Join the chat at https://gitter.im/tabulapdf/tabula-java](https://badges.gitter.im/Join%20Chat.svg)](https://gitter.im/tabulapdf/tabula-java?utm_source=badge&utm_medium=badge&utm_campaign=pr-badge&utm_content=badge)
+tabula-java [![Build Status](https://travis-ci.org/tabulapdf/tabula-java.svg?branch=master)](https://travis-ci.org/tabulapdf/tabula-java) [![Build status](https://ci.appveyor.com/api/projects/status/l5gym1mjhrd2v8yn?svg=true)](https://ci.appveyor.com/project/jazzido/tabula-java)
 ===========
 
 `tabula-java` is a library for extracting tables from PDF files — it is the table extraction engine that powers [Tabula](http://tabula.technology/) ([repo](http://github.com/tabulapdf/tabula)). You can use `tabula-java` as a command-line tool to programmatically extract tables from PDFs.
@@ -16,25 +16,27 @@ Download a version of the tabula-java's jar, with all dependencies included, tha
 `tabula-java` provides a command line application:
 
 ```
-$ java -jar target/tabula-1.0.1-jar-with-dependencies.jar --help
+$ java -jar target/tabula-1.0.2-jar-with-dependencies.jar --help
 usage: tabula [-a <AREA>] [-b <DIRECTORY>] [-c <COLUMNS>] [-d] [-f
               <FORMAT>] [-g] [-h] [-i] [-l] [-n] [-o <OUTFILE>] [-p <PAGES>] [-r]
               [-s <PASSWORD>] [-t] [-u] [-v]
 
 Tabula helps you extract tables from PDFs
 
- -a,--area <AREA>           Portion of the page to analyze. Accepts top,
-                            left,bottom,right.
-                            Example: --area 269.875,12.75,790.5,561.
-                            If all values are between 0-100 (inclusive)
-                            and preceded by '%', input will be taken as
-                            % of actual height or width of the page.
-                            Example: --area %0,0,100,50.
-                            To specify multiple areas, -a option should
-                            be repeated. Default is entire page
+ -a,--area <AREA>           Portion of the page to analyze. Example: --area
+                            269.875,12.75,790.5,561. Accepts
+                            top,left,bottom,right i.e. y1,x1,y2,x2 where all
+                            values are in points relative to the top left
+                            corner. If all values are between 0-100
+                            (inclusive) and preceded by '%', input will be
+                            taken as % of actual height or width of the page.
+                            Example: --area %0,0,100,50. To specify multiple
+                            areas, -a option should be repeated. Default is
+                            entire page
  -b,--batch <DIRECTORY>     Convert all .pdfs in the provided directory.
- -c,--columns <COLUMNS>     X coordinates of column boundaries. Example
-                            --columns 10.1,20.2,30.3
+ -c,--columns <COLUMNS>     X coordinates of column boundaries where values
+                            are in points and relative to the left of the
+                            page. Example --columns 10.1,20.2,30.3
  -d,--debug                 Print detected table areas instead of
                             processing.
  -f,--format <FORMAT>       Output format: (CSV,TSV,JSON). Default: CSV
@@ -69,7 +71,7 @@ Tabula helps you extract tables from PDFs
  -v,--version               Print version and exit.
 ```
 
-It also includes a debugging tool, run `java -cp ./target/tabula-1.0.1-jar-with-dependencies.jar technology.tabula.debug.Debug -h` for the available options.
+It also includes a debugging tool, run `java -cp ./target/tabula-1.0.2-jar-with-dependencies.jar technology.tabula.debug.Debug -h` for the available options.
 
 You can also integrate `tabula-java` with any JVM language. For Java examples, see the [`tests`](src/test/java/technology/tabula/) folder.
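
The revised help text pins down the coordinate conventions for `--area` and `--columns`. The following is a minimal Python sketch of those conventions, not the tabula-java implementation: `Area`, `parse_area`, `parse_columns` and `column_index` are hypothetical names, raising on an out-of-range percentage is an assumption (the help text does not say what happens), and assigning an x position to a column by bisecting the boundaries is likewise an assumption about how the boundaries are used.

```python
from bisect import bisect_right
from typing import List, NamedTuple


class Area(NamedTuple):
    """A page region in points, measured from the top-left corner."""
    top: float
    left: float
    bottom: float
    right: float


def parse_area(spec: str, page_width: float, page_height: float) -> Area:
    """Resolve an --area value (top,left,bottom,right) to points.

    A leading '%' marks the values as percentages of the page height
    (top, bottom) or page width (left, right).
    """
    relative = spec.startswith("%")
    body = spec[1:] if relative else spec
    values = [float(v) for v in body.split(",")]
    if len(values) != 4:
        raise ValueError("expected four values: top,left,bottom,right")
    top, left, bottom, right = values
    if relative:
        # Assumption: reject percentages outside 0-100 instead of guessing.
        if not all(0 <= v <= 100 for v in values):
            raise ValueError("percentages must be between 0 and 100")
        top, bottom = top / 100 * page_height, bottom / 100 * page_height
        left, right = left / 100 * page_width, right / 100 * page_width
    return Area(top, left, bottom, right)


def parse_columns(spec: str) -> List[float]:
    """Parse a --columns value: x coordinates in points from the left edge."""
    return [float(v) for v in spec.split(",")]


def column_index(x: float, boundaries: List[float]) -> int:
    """Return the 0-based column an x position falls into (assumed rule)."""
    return bisect_right(boundaries, x)
```

For a US Letter page (612 x 792 points), `parse_area("%0,0,100,50", 612, 792)` resolves to top 0, left 0, bottom 792, right 306, that is, the left half of the page.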

@@ -101,7 +103,7 @@ You can help by:
 
 ### Backers
 
-You can also support our continued work on `tabula-java` with a one-time or monthly donation [on OpenCollective](https://opencollective.com/tabulapdf#support). Organizations who use `tabula-java` can also [sponsor the project](https://opencollective.com/tabulapdf#support) for acknolwedgement on [our official site](http://tabula.technology/) and this README.
+You can also support our continued work on `tabula-java` with a one-time or monthly donation [on OpenCollective](https://opencollective.com/tabulapdf#support). Organizations who use `tabula-java` can also [sponsor the project](https://opencollective.com/tabulapdf#support) for acknowledgement on [our official site](http://tabula.technology/) and this README.
 
 Special thanks to the following users and organizations for generously supporting Tabula with donations and grants:

appveyor.yml

Lines changed: 3 additions & 3 deletions
@@ -2,14 +2,14 @@ version: '{build}'
 install:
   - ps: |
       Add-Type -AssemblyName System.IO.Compression.FileSystem
-      if (!(Test-Path -Path "C:\maven\apache-maven-3.5.2" )) {
+      if (!(Test-Path -Path "C:\maven\apache-maven-3.5.4" )) {
         (new-object System.Net.WebClient).DownloadFile(
-          'http://www-us.apache.org/dist/maven/maven-3/3.5.2/binaries/apache-maven-3.5.2-bin.zip',
+          'http://www-us.apache.org/dist/maven/maven-3/3.5.4/binaries/apache-maven-3.5.4-bin.zip',
           'C:\maven-bin.zip'
         )
         [System.IO.Compression.ZipFile]::ExtractToDirectory("C:\maven-bin.zip", "C:\maven")
       }
-  - cmd: SET PATH=C:\maven\apache-maven-3.5.2\bin;%JAVA_HOME%\bin;%PATH%
+  - cmd: SET PATH=C:\maven\apache-maven-3.5.4\bin;%JAVA_HOME%\bin;%PATH%
   - cmd: SET MAVEN_OPTS=-Xmx2g
   - cmd: SET JAVA_OPTS=-Xmx2g
 build_script:

pom.xml

Lines changed: 29 additions & 37 deletions
@@ -2,7 +2,7 @@
 <modelVersion>4.0.0</modelVersion>
 <groupId>technology.tabula</groupId>
 <artifactId>tabula</artifactId>
-<version>1.0.2-SNAPSHOT</version>
+<version>1.0.4-SNAPSHOT</version>
 <name>Tabula</name>
 <description>Extract tables from PDF files</description>
 <url>http://github.com/tabulapdf/tabula-java</url>
@@ -36,17 +36,9 @@
 <connection>scm:git:git@github.com:tabulapdf/tabula-java.git</connection>
 <developerConnection>scm:git:git@github.com:tabulapdf/tabula-java.git</developerConnection>
 <url>git@github.com:tabulapdf/tabula-java.git</url>
-<tag>tabula-1.0.0-SNAPSHOT</tag>
+<tag>v1.0.2</tag>
 </scm>
 
-<repositories>
-<repository>
-<id>sonatype</id>
-<name>Sonatype repository</name>
-<url>https://oss.sonatype.org/content/repositories/snapshots/</url>
-</repository>
-</repositories>
-
 <properties>
 <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
 <project.reporting.outputEncoding>UTF-8</project.reporting.outputEncoding>
@@ -68,7 +60,7 @@
 <plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-javadoc-plugin</artifactId>
-<version>2.10.3</version>
+<version>3.2.0</version>
 <configuration>
 <skip>true</skip>
 </configuration>
@@ -81,7 +73,7 @@
 <plugin>
 <groupId>org.sonatype.plugins</groupId>
 <artifactId>nexus-staging-maven-plugin</artifactId>
-<version>1.6.3</version>
+<version>1.6.8</version>
 <extensions>true</extensions>
 <configuration>
 <serverId>ossrh</serverId>
@@ -93,7 +85,7 @@
 <plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-source-plugin</artifactId>
-<version>2.2.1</version>
+<version>3.2.1</version>
 <executions>
 <execution>
 <id>attach-sources</id>
@@ -106,7 +98,7 @@
 <plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-javadoc-plugin</artifactId>
-<version>2.9.1</version>
+<version>3.2.0</version>
 <executions>
 <execution>
 <id>attach-javadocs</id>
@@ -119,7 +111,7 @@
 <plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-gpg-plugin</artifactId>
-<version>1.5</version>
+<version>1.6</version>
 <executions>
 <execution>
 <id>sign-artifacts</id>
@@ -132,10 +124,10 @@
 </plugin>
 <plugin>
 <artifactId>maven-compiler-plugin</artifactId>
-<version>3.1</version>
+<version>3.8.1</version>
 <configuration>
-<source>1.7</source>
-<target>1.7</target>
+<source>1.8</source>
+<target>1.8</target>
 </configuration>
 </plugin>
 <plugin>
@@ -154,7 +146,7 @@
 <plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-surefire-plugin</artifactId>
-<version>2.20.1</version>
+<version>2.22.2</version>
 <configuration>
 <!-- Travis build workaround -->
 <argLine>-Xms1024m -Xmx2048m</argLine>
@@ -181,7 +173,7 @@
 <plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-javadoc-plugin</artifactId>
-<version>2.9.1</version>
+<version>3.2.0</version>
 <executions>
 <execution>
 <id>attach-javadocs</id>
@@ -194,7 +186,7 @@
 <plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-source-plugin</artifactId>
-<version>2.2.1</version>
+<version>3.2.1</version>
 <executions>
 <execution>
 <id>attach-sources</id>
@@ -207,7 +199,7 @@
 <plugin>
 <groupId>org.apache.maven.plugins</groupId>
 <artifactId>maven-gpg-plugin</artifactId>
-<version>1.5</version>
+<version>1.6</version>
 <executions>
 <execution>
 <id>sign-artifacts</id>
@@ -225,45 +217,45 @@
 
 <dependencies>
 <dependency>
-<groupId>net.sf.jsi</groupId>
-<artifactId>jsi</artifactId>
-<version>1.1.0-SNAPSHOT</version>
+<groupId>org.locationtech.jts</groupId>
+<artifactId>jts-core</artifactId>
+<version>1.17.0</version>
 </dependency>
 
 <dependency>
 <groupId>org.slf4j</groupId>
 <artifactId>slf4j-api</artifactId>
-<version>1.7.25</version>
+<version>1.7.30</version>
 </dependency>
 
 <dependency>
 <groupId>org.slf4j</groupId>
 <artifactId>slf4j-simple</artifactId>
-<version>1.7.25</version>
+<version>1.7.30</version>
 </dependency>
 
 <dependency>
 <groupId>org.apache.pdfbox</groupId>
 <artifactId>pdfbox</artifactId>
-<version>2.0.8</version>
+<version>2.0.15</version>
 </dependency>
 
 <dependency>
 <groupId>org.bouncycastle</groupId>
 <artifactId>bcprov-jdk15on</artifactId>
-<version>1.56</version>
+<version>1.66</version>
 </dependency>
 
 <dependency>
 <groupId>org.bouncycastle</groupId>
 <artifactId>bcmail-jdk15on</artifactId>
-<version>1.56</version>
+<version>1.66</version>
 </dependency>
 
 <dependency>
 <groupId>junit</groupId>
 <artifactId>junit</artifactId>
-<version>4.11</version>
+<version>4.13</version>
 <scope>test</scope>
 </dependency>

@@ -276,19 +268,19 @@
 <dependency>
 <groupId>org.apache.commons</groupId>
 <artifactId>commons-csv</artifactId>
-<version>1.4</version>
+<version>1.8</version>
 </dependency>
 
 <dependency>
 <groupId>com.google.code.gson</groupId>
 <artifactId>gson</artifactId>
-<version>2.8.0</version>
+<version>2.8.6</version>
 </dependency>
 
 <dependency>
 <groupId>com.github.jai-imageio</groupId>
 <artifactId>jai-imageio-core</artifactId>
-<version>1.3.1</version>
+<version>1.4.0</version>
 </dependency>
 
 <dependency>
@@ -298,9 +290,9 @@
 </dependency>
 
 <dependency>
-<groupId>com.levigo.jbig2</groupId>
-<artifactId>levigo-jbig2-imageio</artifactId>
-<version>2.0</version>
+<groupId>org.apache.pdfbox</groupId>
+<artifactId>jbig2-imageio</artifactId>
+<version>3.0.3</version>
 </dependency>
 </dependencies>
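
Some of the version bumps in this file cross a digit boundary (for example `pdfbox` 2.0.8 to 2.0.15), where comparing the strings character by character gives the wrong order. The sketch below, in Python, checks a subset of the dependency bumps from this diff numerically. It is an illustration only: `version_key` is a hypothetical helper, and it assumes plain dot-separated numeric versions, dropping any qualifier such as `-SNAPSHOT`, which is a simplification of Maven's real version ordering rules.

```python
from typing import Dict, Tuple


def version_key(version: str) -> Tuple[int, ...]:
    """Turn '2.0.15' into (2, 0, 15); anything after '-' is ignored (assumption)."""
    core = version.split("-", 1)[0]
    return tuple(int(part) for part in core.split("."))


# (old, new) pairs copied from the pom.xml hunks above.
bumps: Dict[str, Tuple[str, str]] = {
    "slf4j-api": ("1.7.25", "1.7.30"),
    "pdfbox": ("2.0.8", "2.0.15"),
    "bcprov-jdk15on": ("1.56", "1.66"),
    "junit": ("4.11", "4.13"),
    "commons-csv": ("1.4", "1.8"),
    "gson": ("2.8.0", "2.8.6"),
    "jai-imageio-core": ("1.3.1", "1.4.0"),
}

# Every listed change should be an upgrade when compared numerically.
upgrades = {name: version_key(old) < version_key(new)
            for name, (old, new) in bumps.items()}
```

Comparing the raw strings would report `"2.0.15" < "2.0.8"`, which is why the versions are split into integers first.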
