xlsxsql

package module
v0.4.0 Latest Latest
Warning

This package is not in the latest version of its module.

Go to latest
Published: Dec 17, 2023 License: MIT Imports: 7 Imported by: 0

README

xlsxsql

PkgGoDev Actions Status

A CLI tool that executes SQL queries on various files including xlsx files and outputs the results to various files.

id name price
1 apple 100
2 orange 50
3 melon 500

A CLI tool that executes SQL queries on xlsx files and outputs the results to various files, and also executes SQL queries on various files and outputs them to xlsx files. Built using excelize and trdsql.

Install

Go install
go install github.com/noborus/xlsxsql/cmd/xlsxsql@latest
Homebrew

You can install Homebrew's xlsxsql with the following command:

brew install noborus/tap/xlsxsql
Binary Downloads

Precompiled binaries for xlsxsql are available for various platforms and architectures. You can download them from the GitHub Releases page.

The following binaries can be downloaded from release.

  • Darwin_arm64
  • Darwin_x86_64
  • Linux_arm64
  • Linux_i386
  • Linux_x86_64
  • Windows_arm64
  • Windows_x86_64

To install a binary, download the appropriate file for your system, extract it, and place the xlsxsql executable in a directory included in your system's PATH.

For example, on a Unix-like system, you might do:

tar xvf xlsxsql_Darwin_x86_64.tar.gz
mv xlsxsql /usr/local/bin/

Usage

$ xlsxsql --help
Execute SQL against xlsx file.
Output to CSV and various formats.

Usage:
  xlsxsql [flags]
  xlsxsql [command]

Available Commands:
  completion  Generate the autocompletion script for the specified shell
  help        Help about any command
  list        List the sheets of the xlsx file
  query       Executes the specified SQL query against the xlsx file
  table       SQL(SELECT * FROM table) for xlsx

Flags:
      --clear-sheet        Clear sheet when outputting to xlsx file
      --debug              debug mode
  -H, --header             Input header
  -h, --help               help for xlsxsql
  -o, --out string         Output Format[CSV|AT|LTSV|JSON|JSONL|TBLN|RAW|MD|VF|YAML|XLSX] (default "GUESS")
      --out-cell string    Cell name to output to xlsx file
  -O, --out-file string    File name to output to file
      --out-header         Output header
      --out-sheet string   Sheet name to output to xlsx file
  -s, --skip int           Skip the number of lines
  -v, --version            display version information

Use "xlsxsql [command] --help" for more information about a command.
List sheets
$ xlsxsql list test.xlsx
Sheet1
Sheet2
Basic usage

The basic usage of xlsxsql is to run a SQL query against an Excel file. The query command is used followed by the SQL query in quotes. The SQL query should include the name of the Excel file. If no sheet is specified, the first sheet will be targeted.

xlsxsql query "SELECT * FROM test.xlsx"

For example, if test.xlsx contains the following data in its first sheet:

Name Age
Alice 20
Bob 25
Carol 30

The output will be:

Name,Age
Alice,20
Bob,25
Carol,30

xlsxsql is an extended version of trdsql, so you can execute SQL on files such as CSV and JSON.

xlsxsql query "SELECT * FROM test.csv"

In other words, you can also do CSV and JOIN.

xlsxsql query -H -o md \
"SELECT a.id,a.name,b.price 
  FROM testdata/test3.xlsx::.C1 AS a
  LEFT JOIN test.csv AS b 
    ON a.id=b.id"
Specify sheet

The sheet can be specified by using a double colon "::" after the file name (the first sheet is selected by default if not specified).

xlsxsql query "SELECT * FROM test.xlsx::Sheet2"
Specify cell

Cell can be specified by using a dot "." after the sheet.

xlsxsql query "SELECT * FROM test3.xlsx::Sheet1.C1"

Optional if the sheet is the first sheet.

xlsxsql query "SELECT * FROM test3.xlsx::.C1"

[!NOTE] If cell is specified, the table up to the blank column is considered to be the table. ​ This allows multiple tables to be specified on one sheet, and JOIN is also possible.

xlsxsql query -H -o md \
"SELECT a.id,a.name,b.price 
  FROM testdata/test3.xlsx::.C1 AS a
  LEFT JOIN testdata/test3.xlsx::.F4 AS b 
    ON a.id=b.id"
Shorthand designation

The table command is a shorthand that allows you to quickly display the contents of a specified sheet in a table format. The syntax is xlsxsql table <filename>::<sheetname>.<cellname>. If no sheet name is specified, the first sheet of the Excel file will be targeted.

Here is an example:

xlsxsql table test.xlsx::Sheet2.C1

It can be omitted for the first sheet.

xlsxsql table test.xlsx::.C1
Skip Options

The --skip or -s option skips the specified number of lines. For example, you would use it like this:

xlsxsql query --skip 1 "SELECT * FROM test.xlsx::Sheet2"

Skip is useful when specifying sheets, allowing you to skip unnecessary rows. (There seems to be no advantage to using skip when specifying Cell.)

Output format
xlsxsql query --out JSONL "SELECT * FROM test.xlsx::Sheet2"

You can choose from CSV, LTSV, JSON, JSONL, TBLN, RAW, MD, VF, YAML, (XLSX).

Output to xlsx file

You can output the result to an xlsx file by specifying a file name with the .xlsx extension as the --out-file option. For example:

xlsxsql query --out-file test2.xlsx "SELECT * FROM test.xlsx::Sheet2"

[!NOTE] You can also output to the same xlsx file as the input file. Please be careful as the contents will be overwritten.

[!NOTE] Even if you specify XLSX with --out, you must specify a file name with the extension .xlsx.

This command will execute the SQL query on the Sheet1 of test.xlsx and output the result to result.xlsx. If the file does not exist, it will be created. If the file already exists, the results will be updated.

You can specify the sheet and cell to output, if you want to output to an xlsx file. For example:

xlsxsql query --out-file test2.xlsx --out-sheet Sheet2 --out-cell C1 "SELECT * FROM test.xlsx::Sheet2"

You can clear the sheet before outputting to an xlsx file by specifying the --clear-sheet option. For example:

xlsxsql query --out-file test2.xlsx --clear-sheet "SELECT * FROM test.xlsx::Sheet2"
Multiple queries

It is also possible to output after executing an update query. A SELECT query is required for output.

xlsxsql query --header --out-header --out-file test.xlsx --out-sheet Sheet2 \
"UPDATE test.xlsx SET Age=Age+1 WHERE Name='Alice';
 SELECT * FROM test.xlsx"

Documentation

Overview

Package xlsxsql provides a reader for XLSX files. It uses the trdsql and excelize/v2 packages to read XLSX files and convert them into SQL tables. The main type is XLSXReader, which implements the trdsql.Reader interface.

Index

Constants

This section is empty.

Variables

View Source
var (
	ErrSheetNotFound = fmt.Errorf("sheet not found")
	ErrNoData        = fmt.Errorf("no data")
)
View Source
var ErrInvalidFileName = errors.New("file name must end with .xlsx")

Functions

func NewXLSXReader

func NewXLSXReader(reader io.Reader, opts *trdsql.ReadOpts) (trdsql.Reader, error)

NewXLSXReader function takes an io.Reader and trdsql.ReadOpts, and returns a new XLSXReader. It reads the XLSX file, retrieves the sheet specified by the InJQuery option, and reads the rows into the XLSXReader.

func XLSXSheet

func XLSXSheet(fileName string) ([]string, error)

XLSXSheet returns the sheet name of the XLSX file.

Types

type WriteOpt added in v0.3.0

type WriteOpt func(*WriteOpts)

WriteOpt is a function to set WriteOpts.

func Cell added in v0.3.0

func Cell(f string) WriteOpt

Cell sets the cell name.

func ClearSheet added in v0.3.0

func ClearSheet(f bool) WriteOpt

ClearSheet sets the flag to clear the sheet.

func ErrStream added in v0.3.0

func ErrStream(f io.Writer) WriteOpt

ErrStream sets the error output destination.

func FileName added in v0.3.0

func FileName(f string) WriteOpt

FileName sets the output file name.

func Header(f bool) WriteOpt

func Sheet added in v0.3.0

func Sheet(f string) WriteOpt

Sheet sets the sheet name.

type WriteOpts added in v0.3.0

type WriteOpts struct {
	// ErrStream is the error output destination.
	ErrStream io.Writer
	// FileName is the output file name.
	FileName string
	// Sheet is the sheet name.
	Sheet string
	// Cell is the cell name.
	Cell string
	// ClearSheet is the flag to clear the sheet.
	ClearSheet bool
	// WriteHeader is the flag to write the header.
	Header bool
}

WriteOpts represents options that determine the behavior of the writer.

type XLSXReader

type XLSXReader struct {
	// contains filtered or unexported fields
}

XLSXReader is a reader for XLSX files.

func (XLSXReader) Names

func (r XLSXReader) Names() ([]string, error)

Names returns the column names of the XLSX file.

func (XLSXReader) PreReadRow

func (r XLSXReader) PreReadRow() [][]any

PreReadRow returns the rows of the XLSX file.

func (XLSXReader) ReadRow

func (r XLSXReader) ReadRow(row []any) ([]any, error)

ReadRow only returns EOF.

func (XLSXReader) Types

func (r XLSXReader) Types() ([]string, error)

Types returns the column types of the XLSX file.

type XLSXWriter added in v0.3.0

type XLSXWriter struct {
	// contains filtered or unexported fields
}

XLSXWriter is a writer for XLSX files.

func NewXLSXWriter added in v0.3.0

func NewXLSXWriter(options ...WriteOpt) (*XLSXWriter, error)

NewXLSXWriter function takes an io.Writer and trdsql.WriteOpts, and returns a new XLSXWriter.

func (*XLSXWriter) PostWrite added in v0.3.0

func (w *XLSXWriter) PostWrite() error

PostWrite function closes the XLSXWriter.

func (*XLSXWriter) PreWrite added in v0.3.0

func (w *XLSXWriter) PreWrite(columns []string, types []string) error

PreWrite function opens the XLSXWriter.

func (*XLSXWriter) WriteRow added in v0.3.0

func (w *XLSXWriter) WriteRow(row []interface{}, columns []string) error

WriteRow function writes a row to the XLSXWriter.

Directories

Path Synopsis
cmd

Jump to

Keyboard shortcuts

? : This menu
/ : Search site
f or F : Jump to
y or Y : Canonical URL