Quick Start

This section describes how to run YCSB on Solr running locally.

1. Set Up YCSB

Clone the YCSB git repository and compile:

git clone https://github.com/brianfrankcooper/YCSB.git
cd YCSB
mvn -pl com.yahoo.ycsb:solr-binding -am clean package

2. Set Up Solr

There must be a running Solr instance with a core/collection pre-defined and configured.

  • See this API reference on how to create a core.
  • See this API reference on how to create a collection in SolrCloud mode.

The conf/schema.xml configuration file in the core/collection you just created must define the field names used during benchmarking. The following sample from a schema config file matches the default field names used by the YCSB client:

<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>
<field name="field0" type="text_general" indexed="true" stored="true"/>
<field name="field1" type="text_general" indexed="true" stored="true"/>
<field name="field2" type="text_general" indexed="true" stored="true"/>
<field name="field3" type="text_general" indexed="true" stored="true"/>
<field name="field4" type="text_general" indexed="true" stored="true"/>
<field name="field5" type="text_general" indexed="true" stored="true"/>
<field name="field6" type="text_general" indexed="true" stored="true"/>
<field name="field7" type="text_general" indexed="true" stored="true"/>
<field name="field8" type="text_general" indexed="true" stored="true"/>
<field name="field9" type="text_general" indexed="true" stored="true"/>
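
If the workload uses a different number of fields (for example, by changing YCSB's fieldcount property, which defaults to 10), the matching definitions can be generated rather than typed by hand. The following is a minimal sketch; the function name gen_fields is hypothetical, and it assumes the same field types as the sample above.

```shell
# Print the id field followed by field0..field(N-1).
# N defaults to 10, which matches the sample schema above.
gen_fields() {
  count="${1:-10}"
  printf '<field name="id" type="string" indexed="true" stored="true" required="true" multiValued="false"/>\n'
  i=0
  while [ "$i" -lt "$count" ]; do
    printf '<field name="field%d" type="text_general" indexed="true" stored="true"/>\n' "$i"
    i=$((i + 1))
  done
}

gen_fields 10
```

The output can be pasted into the schema config file of the core/collection.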

If running in SolrCloud mode, ensure there is an external Zookeeper cluster running.

  • See here for details on how to set up an external Zookeeper cluster.
  • See here for instructions on how to use Zookeeper to manage your core/collection configuration files.

3. Run YCSB

Now you are ready to run! First, load the data:

./bin/ycsb load solr -s -P workloads/workloada -p table=<core/collection name>

Then, run the workload:

./bin/ycsb run solr -s -P workloads/workloada -p table=<core/collection name>

For further configuration, see the sections below.

Default Configuration Parameters

The default settings for the Solr node that is created are as follows:

  • solr.cloud

    • A Boolean value indicating whether Solr is running in SolrCloud mode. If it is, an external Zookeeper cluster must also be running.
    • Default value is false, so the client expects Solr to be running in stand-alone mode.
  • solr.base.url

    • The base URL used to connect to a running Solr instance in stand-alone mode.
    • Default value is http://localhost:8983/solr
  • solr.commit.within.time

    • The max time in ms to wait for a commit when in batch mode, ignored otherwise
    • Default value is 1000ms
  • solr.batch.mode

    • Indicates whether inserts/updates/deletes should be committed in batches (frequency controlled by the solr.commit.within.time parameter) or one document at a time.
    • Default value is false
  • solr.zookeeper.hosts

    • A comma-separated list of host:port pairs of the Zookeeper nodes used to manage SolrCloud configurations.
    • Must be passed when in SolrCloud mode.
    • Default value is localhost:2181
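
Any of the parameters above can also be set on the command line with -p, the same flag used for table in the Quick Start. The sketch below builds a load command with batch mode enabled; the core name ycsb and the 500 ms commit window are illustrative assumptions, not recommended values.

```shell
# Hypothetical core name; replace it with the core/collection you created.
CORE="ycsb"

# Enable batch mode and shorten the commit window from the default 1000 ms.
# The command is printed rather than executed so it can be reviewed first.
cmd="./bin/ycsb load solr -s -P workloads/workloada -p table=${CORE} -p solr.batch.mode=true -p solr.commit.within.time=500"
echo "$cmd"
```

Run the printed command from the YCSB checkout directory once it looks right.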

Custom Configuration

If you wish to customize the settings used to create the Solr node, you can create a new property file that contains your desired Solr node settings and pass it in via the -P parameter of the 'bin/ycsb' script. Note that the default properties will be kept unless you explicitly override them.

Assuming a properties file named "myproperties.data" contains your custom Solr node configuration, pass it to the Solr client as follows:

./bin/ycsb run solr -P workloads/workloada -P myproperties.data -s
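
As an illustration of what such a file might contain, the sketch below writes a myproperties.data that overrides three of the defaults listed earlier. The values are assumptions chosen for the example; any property left out keeps its default.

```shell
# Write a small properties file in the current directory.
# Each line is key=value, using the property names documented above.
cat > myproperties.data <<'EOF'
solr.base.url=http://localhost:8983/solr
solr.batch.mode=true
solr.commit.within.time=500
EOF

# Show the result for a quick visual check.
cat myproperties.data
```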

If you wish to use SolrCloud mode, ensure that a Solr cluster is running with an external Zookeeper cluster and that an appropriate collection has been created. Make sure to pass the following properties as parameters to the 'bin/ycsb' script.

solr.cloud=true
solr.zookeeper.hosts=<zkHost1>:<zkPort1>,...,<zkHostN>:<zkPortN>
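
A malformed host list is an easy mistake to make here, so it can help to check the value before starting a long run. The sketch below is one possible check; the function name valid_zk_hosts and the collection name ycsb are hypothetical. It only accepts plain host:port pairs and does not handle IPv6 addresses or a chroot suffix.

```shell
# Return 0 if $1 is a comma-separated list of host:port pairs, 1 otherwise.
valid_zk_hosts() {
  printf '%s\n' "$1" | grep -Eq '^[A-Za-z0-9.-]+:[0-9]+(,[A-Za-z0-9.-]+:[0-9]+)*$'
}

# Hypothetical value; replace it with your Zookeeper nodes.
ZK_HOSTS="localhost:2181"

if valid_zk_hosts "$ZK_HOSTS"; then
  # Print the SolrCloud run command for review instead of executing it.
  echo "./bin/ycsb run solr -s -P workloads/workloada -p table=ycsb -p solr.cloud=true -p solr.zookeeper.hosts=${ZK_HOSTS}"
else
  echo "invalid solr.zookeeper.hosts value: ${ZK_HOSTS}" >&2
fi
```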