
Releases: Bin-Huang/NodeSpider

0.9.3

02 Nov 08:02
Pre-release

remove plan's type;
rewrite the concurrency handling;
remove option rateLimit;
rename option maxConnections to concurrency;
update unit tests;
update documentation;

add support of event

31 Aug 11:36
Pre-release

add support for events:

"empty"

When there are no more tasks in the queue, the event "empty" is emitted.

"queueTask"

When a new task is added to the queue, the event "queueTask" is emitted with the parameter taskObject.

"vacant"

When the queue is empty and all tasks have been done, the event "vacant" is emitted.
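
The difference between "empty" and "vacant" is easier to see in code. Below is a minimal sketch built on Node's EventEmitter; the class DemoQueue and its methods addTask, next and done are hypothetical stand-ins for illustration, not NodeSpider's actual implementation.

```typescript
import { EventEmitter } from "events";

// Hypothetical stand-in for a spider's task queue; not NodeSpider's real class.
interface Task { url: string; }

class DemoQueue extends EventEmitter {
  private waiting: Task[] = [];
  private running = 0;

  // Adding a task emits "queueTask" with the task object.
  addTask(task: Task): void {
    this.waiting.push(task);
    this.emit("queueTask", task);
  }

  // Take the next task; emits "empty" once the last waiting task is taken.
  next(): Task | undefined {
    const task = this.waiting.shift();
    if (task !== undefined) {
      this.running++;
      if (this.waiting.length === 0) this.emit("empty");
    }
    return task;
  }

  // Mark one running task as done; emits "vacant" when nothing waits or runs.
  done(): void {
    if (this.running > 0) this.running--;
    if (this.waiting.length === 0 && this.running === 0) this.emit("vacant");
  }
}

const log: string[] = [];
const queue = new DemoQueue();
queue.on("queueTask", (task: Task) => log.push(`queueTask:${task.url}`));
queue.on("empty", () => log.push("empty"));
queue.on("vacant", () => log.push("vacant"));

queue.addTask({ url: "http://example.com/a" });
queue.addTask({ url: "http://example.com/b" });
queue.next();
queue.next(); // the queue has no waiting tasks now: "empty"
queue.done();
queue.done(); // no waiting and no running tasks: "vacant"
```

In this sketch "empty" fires while tasks may still be running, whereas "vacant" fires only after the last running task finishes, which makes it the natural place to shut a crawler down.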

new method `add`, overwrite method `plan`, delete method `pipe`

12 Aug 03:28
  • new method add: use method add to add a new plan or pipe to the spider instance
  • rewrite method plan: method plan now adds only a default plan, instead of a plan of any type
  • delete method pipe: to add a new pipe, use method add
  • better error messages in English
  • translate README.md to English
  • add unit tests for methods add and plan

new pipe: csv-pipe

02 Aug 13:43
Pre-release

add new pipe: csv-pipe, which saves data to a .csv file. It is useful for collecting data for analysis.
fix txt-pipe: write the header items on initialization;
update the documentation;
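
As an illustration of the idea behind a CSV pipe that writes its header on initialization, here is a minimal sketch. The names DemoCsvPipe, escapeCsv and save are hypothetical and are not taken from NodeSpider's API; the sink is an in-memory callback so that the example stays self-contained.

```typescript
// Hypothetical sketch; names are illustrative and not NodeSpider's API.
type Row = Record<string, unknown>;

// Quote a field when it contains a comma, a quote or a line break.
function escapeCsv(value: unknown): string {
  const text = value === undefined || value === null ? "" : String(value);
  return /[",\r\n]/.test(text) ? `"${text.replace(/"/g, '""')}"` : text;
}

class DemoCsvPipe {
  constructor(private header: string[], private write: (line: string) => void) {
    // The header row is written once, at initialization.
    this.write(header.map(escapeCsv).join(",") + "\n");
  }

  // Write one data row, with columns in header order; missing keys become empty.
  save(row: Row): void {
    this.write(this.header.map((key) => escapeCsv(row[key])).join(",") + "\n");
  }
}

const lines: string[] = [];
const pipe = new DemoCsvPipe(["title", "url"], (line) => lines.push(line));
pipe.save({ title: 'Hello, "world"', url: "http://example.com" });
```

To write to a real file instead, the sink could wrap a stream from Node's fs.createWriteStream and call its write method for each line.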

better method `retry` implementation and the cancel of plan's default info

01 Aug 11:06

better method retry implementation: in practice it isn't necessary to pass err and current as parameters to finalErrorCallback (a parameter of method retry), because the callback can already reach them through its closure. Hence some things changed:

  • modified method retry
  • removed the member maxRetry from task object and ITask
  • removed the member error from default plan's interface ICurrent
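
The closure argument above can be sketched as follows. The shapes DemoTask, retry and onError are hypothetical and simplified for illustration; they are not NodeSpider's real signatures.

```typescript
// Hypothetical shapes for illustration; not NodeSpider's real signatures.
interface DemoTask { url: string; retried: number; }

const requeued: DemoTask[] = [];
const failures: string[] = [];

// Re-queue the task until maxRetry is reached, then call the final callback.
// The callback takes no arguments: the caller's closure already holds
// whatever context it needs.
function retry(task: DemoTask, maxRetry: number, finalErrorCallback: () => void): void {
  if (task.retried < maxRetry) {
    task.retried++;
    requeued.push(task);
  } else {
    finalErrorCallback();
  }
}

// A failure handler: err and current are in scope here, so the final
// callback captures them instead of receiving them as parameters.
function onError(err: Error, current: DemoTask): void {
  retry(current, 2, () => failures.push(`${current.url}: ${err.message}`));
}

const task: DemoTask = { url: "http://example.com", retried: 0 };
onError(new Error("timeout"), task); // first retry
onError(new Error("timeout"), task); // second retry
onError(new Error("timeout"), task); // retries exhausted, final callback runs
```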

remove plan's default info: the task's info is designed to store task-specific information so that it can be used in the callback as a member of current. With that in mind, info should be set on the task instead of on the plan. Hence the confusing support for a plan's default info is removed:

  • modify IPlan, stream plan, function startTask, download plan, default plan

add parameter check to method save;
adapt the documentation and unit tests to the changes;

done the parameter check function for spider initialization, fix some bug, add some unit test

28 Jul 03:26

finish the parameter check function for spider initialization;
fix a bug in method filter;
add class Queue to module export;
add unit tests for the parameter checker used when initializing a spider (passing);
add unit tests for function defaultPlan (passing);
add unit tests for methods isExist and filter (passing);
modify the package's description;

fix some bug and pass unit test of task's special info

27 Jul 03:40

fix the function startTask and debug task's special info;
add unit tests for task's special info (passing);
move all test files from folder example to folder test;

fix some bug

27 Jul 02:52
Pre-release

fix the method retry: parameter current should implement ITask;
fix the method queue: task's special option is no longer supported, but task's special info still is;
fix the function startTask: if an error occurs while executing plan.process, throw it;
import uuid/v1 instead of uuid;
delete outdated comments;
add a warning about the package's stability;

support to limit connections's number of different plan type, and cancel the support of task's special option

26 Jul 13:47

Supporting task's special option may be a bad idea: it lets developers store many repeated opts in the queue, such as anonymous callback functions, which wastes memory and would be hard to store in Redis in the future.
consider the different needs of different plan types when managing tasks' maxConnections;
Remove support for task's special option; task's special info is retained;
delete all unit tests for task's special option.
Class Plan has been deleted and replaced by interface IPlan: any object implementing IPlan is considered a plan.
To count connections per type, add member currentConnections to _STATE.
To flexibly limit the max number of connections, the option maxTotalConnections has been deleted and replaced by maxConnections, which can be a number or an object ([key: type]: number of connections).
Members type and info no longer belong to the plan's option but to the plan object;
Only a task implementing ITask will be passed as a parameter to the plan's process function;
rewrite the timer callback and split it into several functions;
adapt methods plan, retry and queue to support different types;
store and manage tasks separately in the Queue according to their plan types;
simplify the API protocol in interfaces IQueue, IState and IDefaultOption;
adapt queue's unit test to different types of plan;
modify method getWaitingTaskNum of class Queue;
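
The maxConnections option described above (a number, or an object mapping plan type to a connection limit) could be checked roughly like this. This is only a sketch: the function canStart is hypothetical, and two behaviours are assumptions not stated in these notes, namely that a plain number limits the total across all types and that a type missing from the object form may not start.

```typescript
// Hypothetical option shape based on the note above: a number or a per-type map.
type MaxConnections = number | { [type: string]: number };

// Decide whether a task of the given plan type may start, given the
// current number of open connections per type (cf. currentConnections).
function canStart(
  type: string,
  max: MaxConnections,
  current: { [type: string]: number }
): boolean {
  if (typeof max === "number") {
    // Assumption: a plain number limits the total across all types.
    let total = 0;
    for (const key of Object.keys(current)) total += current[key];
    return total < max;
  }
  // Assumption: a type with no entry in the map is not allowed to start.
  const limit = max[type];
  return limit !== undefined && (current[type] || 0) < limit;
}

const current = { default: 1, download: 1 };
const totalFull = canStart("default", 2, current);                           // total limit reached
const totalFree = canStart("default", 3, current);                           // one slot left overall
const downloadFull = canStart("download", { default: 5, download: 1 }, current);
const defaultFree = canStart("default", { default: 5, download: 1 }, current);
```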

0.5.7

21 Jul 23:53
Pre-release

when the method end is called, close all pipes in the spider and emit the event end