Commit 8682454

Merge pull request #8 from LaurentGH/search
New Search page with detailled explanations
2 parents ed29812 + d1bb0ec commit 8682454

1 file changed

+154
-13
lines changed

docs/Search.md

Lines changed: 154 additions & 13 deletions
@@ -5,21 +5,162 @@ parent: Features
55
---
66

77
{: .note }
8-
> This only applies to built-in search, other search plugins like may override this syntax.
8+
> This only applies to the built-in search. Search plugins may override this syntax.
99
10-
Search query consists of several keywords. Keyword starting with "-" is considered a negative match. Several special keywords are available:
10+
A search query consists of one or several keywords.
1111

12-
* ``@{date}`` - match by date. For example, @yesterday or @2011-11-03. Please note that due to incomplete implementation, special date keywords like yesterday might not match all articles if user timezone is different from tt-rss internal timezone (UTC).
13-
* ``pub:{true,false}`` - match only published or unpublished articles
14-
* ``star:{true, false}`` - same, starred articles
15-
* ``unread:{true, false}`` - self explanatory (requires trunk as of 05.03.2015)
16-
* ``note:{true, false, sometext}`` - same, for articles having an attached note or matching the specified text
17-
* ``label:Somelabel`` - articles that belong to a specified label
18-
* ``tag:mytag`` - articles which have specified tag
19-
* ``title:``, ``author:`` - self explanatory
12+
<a id="text-keyword"></a>A keyword can be a **text keyword**. A text keyword is a single word such as `ocean`, or successive words enclosed in quotes such as `"pacific ocean"`. These keywords are searched using the PostgreSQL [Full Text Search](#full-text-search) engine. This engine supports [word stemming](#word-stemming), and [logical operators](#logical-operators).
2013

21-
When searching by keyword with spaces, use quotes like this: `"title:string with spaces"` or `tag:"multiple words"`
14+
<a id="namevalue-keyword"></a>A keyword can also be a **name-value keyword**:
15+
- `star:true`, `star:false` - match starred or not starred articles
16+
- `unread:true`, `unread:false` - match unread or read articles
17+
- `pub:true`, `pub:false` - match published or unpublished articles
18+
- `title:sometext`, `title:"two words"` - match articles with a title containing the specified text (sub-string match)
19+
- `author:sometext`, `author:"two words"` - match articles with an author containing the specified text (sub-string match)
20+
- `note:true`, `note:false`, `note:sometext`, `note:"two words"` - match articles with a note, no note, or a note containing the specified text (sub-string match)
21+
- `label:true`, `label:false`, `label:somelabel`, `label:"two words"` - match articles with a label, no label, or belonging to the specified label (exact-string match)
22+
- `tag:true`, `tag:false`, `tag:sometag`, `tag:"two words"` - match articles with a tag, no tag, or associated to the specified tag (exact-string match)
2223

23-
If no special keywords are specified, search is done using PostgreSQL [Full Text Search](https://www.postgresql.org/docs/current/textsearch-intro.html) engine.
24+
A keyword can also be a **date keyword**:
25+
- `@somedate` - match by article publication [date](#date_keyword)
2426

25-
Pointless as it may be, you can combine negative prefix with the special keywords: `-star:true` would essentially mean `star:false`.
27+
A keyword starting with `-` (negative sign) is considered a negative match. The `-` can be applied before any type of keyword. For example `-unwanted`, `-"unwanted words"`, `-title:unwanted`, `-tag:"unwanted words"` or `-@yesterday`.
28+
29+
A logical `AND` operator is applied between keywords. For example, `ocean "tree flower" note:true -title:"orange color"` searches for articles containing the word _ocean_ (with [stemming](#word-stemming)) AND the phrase _"tree flower"_ (with [stemming](#word-stemming)) AND a note AND a title not containing the string _"orange color"_. This _AND_ is implicit: it is applied by default and must not be written in the query.
30+
31+
Other [logical operators](#logical-operators) are only supported around a [text keyword](#text-keyword), because they are processed by the PostgreSQL [Full Text Search](#full-text-search) engine. A [name-value keyword](#namevalue-keyword) or a [date keyword](#date_keyword) does not support those PostgreSQL [logical operators](#logical-operators).
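
The three keyword types and the negative prefix can be illustrated with a minimal Python sketch. It is not the Tiny Tiny RSS parser, whose code is not shown on this page. `classify_keyword` and `NAME_VALUE_NAMES` are hypothetical names, and the input is assumed to be a single keyword whose quotes have already been removed.

```python
# Hypothetical sketch: classify one already-split keyword.
# The names below come from the name-value list above.
NAME_VALUE_NAMES = {"star", "unread", "pub", "title", "author", "note", "label", "tag"}


def classify_keyword(keyword):
    """Return (negated, kind, payload) for a single keyword."""
    negated = keyword.startswith("-")
    body = keyword[1:] if negated else keyword

    # A date keyword starts with "@".
    if body.startswith("@"):
        return (negated, "date", body[1:])

    # A name-value keyword is "name:value" with a known name.
    name, sep, value = body.partition(":")
    if sep and name in NAME_VALUE_NAMES:
        return (negated, "name-value", (name, value))

    # Anything else is treated as a text keyword.
    return (negated, "text", body)


print(classify_keyword("-title:orange color"))
print(classify_keyword("-@yesterday"))
print(classify_keyword("ocean"))
```

In this sketch an unknown prefix such as `secu:*` falls through to the text branch, so it would be handed to the full text search engine unchanged.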
32+
33+
34+
<a id="full-text-search"></a>
35+
## PostgreSQL Full Text Search
36+
37+
Tiny Tiny RSS uses a PostgreSQL database, which provides a [Full Text Search engine](https://www.postgresql.org/docs/current/textsearch-intro.html) (external link).
38+
39+
It supports two main features:
40+
- [Word stemming](#word-stemming)
41+
- [Logical operators](#logical-operators)
42+
43+
<a id="word-stemming"></a>
44+
### Word stemming
45+
46+
Word stemming is a process to find the stem (root) of a word. For example, in English the words _security_, _secure_ and _secured_ all share the same stem: _secur_. PostgreSQL names this stem a _lexeme_. A _lexeme_ is a normalized string so that different forms of the same word are made alike.
47+
48+
Word stemming is only available for [text keywords](#text-keyword).
49+
50+
Here is a full example. An RSS feed provides an article containing the word _security_. If the user has configured the language of this feed as English, then this word _security_ is stored in the database as its lexeme _secur_.
51+
Later, the user opens the search form, selects the English language, and searches for _secured_. Tiny Tiny RSS sends _secured_ to PostgreSQL, which converts this query to the lexeme _secur_. As both lexemes are identical, the article containing _security_ matches the search query _secured_.
52+
53+
Word stemming is powerful, but has one drawback: the language of the feed and the language of the search query both have to be configured correctly. Indeed, the word stemming process depends on the language: French and English words are not stemmed in the same way, so comparing them may lead to unexpected results.
54+
55+
In Tiny Tiny RSS there is a special language named _Simple_. Word stemming in the _Simple_ language is almost equivalent to exact string matching. With the _Simple_ language, only punctuation such as commas is removed. The power of word stemming is thus not applied, but it works well when several languages are mixed.
56+
57+
The user can also manually set a prefix in the search query using the syntax `secu:*`, which matches every word starting with `secu`.
58+
59+
<a id="logical-operators"></a>
60+
### Logical operators
61+
62+
A [text keyword](#text-keyword) can be surrounded by logical operators provided by the PostgreSQL [Full Text Search](#full-text-search) engine.
63+
64+
{: .note }
65+
> Due to current parser limitations, these logical operators cannot be applied to a [name-value keyword](#namevalue-keyword) or to a [date keyword](#date_keyword).
66+
67+
PostgreSQL provides:
68+
- `!` : logical NOT
69+
- `&` : logical AND
70+
- `|` : logical OR
71+
- `(` and `)` : parentheses can be used to control nesting of operators. Without parentheses, `|` binds least tightly, then `&`, and `!` most tightly.
72+
73+
For example: `ocean & ( ( pacific | atlantic ) & ! "black sea" )`
74+
75+
{: .warning }
76+
> Due to current parser limitations, the placement of spaces matters:
77+
> - Spaces are **required around words enclosed in quotes** such as `"black sea"`, otherwise the parser does not detect the quotes.
78+
> - Spaces are **recommended around single words** such as `atlantic`, otherwise the word is not highlighted (please also see [highlighting limitations](#highlighting_limitations)).
79+
> - Spaces are **recommended around logical operators**, otherwise highlighting may not work correctly.
80+
81+
{: .warning }
82+
> Due to current parser limitations, when at least one operator is detected Tiny Tiny RSS does not apply the default _AND_ operator. Tiny Tiny RSS expects the whole query to be well formatted. For example the query `one two` works because no operator is detected, so Tiny Tiny RSS adds the _AND_.
83+
> However, `one two & three` fails because Tiny Tiny RSS detects the `&` operator, so it expects the whole query to be well formatted and does not add the missing `&` between the words `one` and `two`.
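
The behaviour described in this warning can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the actual Tiny Tiny RSS code: `combine_text_keywords` and `OPERATOR_CHARS` are hypothetical names, and quoted phrases are deliberately ignored to keep the sketch short.

```python
# Hypothetical sketch of the "implicit AND" rule for text keywords.
# Assumption: the presence of any operator character disables the implicit AND.
OPERATOR_CHARS = set("&|!()")


def combine_text_keywords(text_part):
    """Join plain words with ' & ' unless an operator is already present."""
    if any(ch in OPERATOR_CHARS for ch in text_part):
        # An operator was detected: the query is passed through unchanged,
        # so any missing operator stays missing.
        return text_part
    return " & ".join(text_part.split())


print(combine_text_keywords("one two"))          # implicit AND is added
print(combine_text_keywords("one two & three"))  # left as is, hence malformed
```

The second call returns its input untouched, which is why `one two & three` reaches the search engine without an operator between `one` and `two`.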
84+
85+
{: .note }
86+
> When a search query contains [name-value](#namevalue-keyword)/[date](#date_keyword) keywords and [text keywords](#text-keyword) using _logical operators_, it is recommended to write the [text keywords](#text-keyword) at the end (or the beginning), and to surround them with parentheses.
87+
> For example, when reading `-title:submarine @yesterday ( pacific | atlantic )` one can easily understand that the parentheses contain a complex fragment that has to be well formatted with no missing operator.
88+
89+
{: .note }
90+
> Due to current parser limitations, the `-` negation does not work before a parenthesis (only before a [text keyword](#text-keyword)). When a parenthesized group needs to be negated, use the `!` operator. For example: `-title:submarine @yesterday ( ! ( pacific | atlantic ) )`
91+
92+
93+
<a id="date_keyword"></a>
94+
## Date keyword
95+
96+
A _date keyword_ can filter articles based on their publication or update date:
97+
- `@2025-10-28` formatted as `@YYYY-MM-DD`
98+
- `@2025/10/28` formatted as `@YYYY/MM/DD`
99+
- `@28/10/2025` formatted as `@DD/MM/YYYY`
100+
- `@"28 oct 2025"`
101+
- `@"October 28"` (of the current year)
102+
- `@today`
103+
- `@yesterday`
104+
- `@"2 days ago"`
105+
- `@"last monday"`
106+
107+
{: .note }
108+
> A _date keyword_ has to represent a fixed day. For example `@"last week"`, `@2023-11` or `@2024` cannot be used because they represent a range of several days.
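
The "fixed day" rule can be illustrated with a small Python sketch that resolves a few of the forms listed above to a single calendar day and rejects values that denote a range. It is an assumption-laden illustration, not the Tiny Tiny RSS implementation: `parse_day` is a hypothetical helper, it covers only some of the listed forms, and it takes the current day as a parameter so that the result does not depend on the clock or on time zones.

```python
import re
from datetime import date, datetime, timedelta


def parse_day(value, today):
    """Return the single day denoted by `value`, or None if it is not a fixed day."""
    value = value.strip().lower()
    if value == "today":
        return today
    if value == "yesterday":
        return today - timedelta(days=1)

    # Relative form such as "2 days ago".
    match = re.fullmatch(r"(\d+) days? ago", value)
    if match:
        return today - timedelta(days=int(match.group(1)))

    # Absolute forms: YYYY-MM-DD, YYYY/MM/DD, DD/MM/YYYY.
    for fmt in ("%Y-%m-%d", "%Y/%m/%d", "%d/%m/%Y"):
        try:
            return datetime.strptime(value, fmt).date()
        except ValueError:
            continue

    # Anything else (for example "2023-11" or "last week") is not a fixed day here.
    return None


reference = date(2025, 10, 29)
print(parse_day("28/10/2025", reference))
print(parse_day("yesterday", reference))
print(parse_day("2023-11", reference))
```

Values such as `2023-11` or `2024` match none of the full-date formats, so the sketch returns `None` instead of guessing a day inside the range.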
109+
110+
111+
<a id="quoting_variants"></a>
112+
## Quoting variants
113+
114+
When a [text keyword](#text-keyword) contains spaces, and is negated, it can be written in two ways:
115+
- `-"pacific ocean"` - the recommended usage because it is more readable
116+
- `"-pacific ocean"`
117+
118+
When a [name-value keyword](#namevalue-keyword) contains spaces, it can be written in two ways:
119+
- `title:"two words"` - the recommended usage because it is more readable
120+
- `"title:two words"`
121+
122+
The negative expression can be written in two ways:
123+
- `-title:"two words"` - the recommended usage because it is more readable
124+
- `"-title:two words"`
125+
126+
When a [date keyword](#date_keyword) contains spaces, it can be written in two ways:
127+
- `@"two words"` - the recommended usage because it is more readable
128+
- `"@two words"`
129+
130+
The negative expression can be written in two ways:
131+
- `-@"two words"` - the recommended usage because it is more readable
132+
- `"-@two words"`
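
The equivalence of the two spellings in each pair can be shown with a small tokenizer sketch. It uses Python's standard `shlex` module as a stand-in, which is an assumption: the actual Tiny Tiny RSS splitting code is not shown here and may differ in edge cases. `split_keywords` is a hypothetical helper.

```python
import shlex


def split_keywords(query):
    """Split a query on spaces, treating double quotes as grouping characters."""
    lexer = shlex.shlex(query, posix=True)
    lexer.whitespace_split = True  # split on whitespace only
    lexer.quotes = '"'             # only double quotes group words
    lexer.commenters = ""          # "#" has no special meaning in a query
    return list(lexer)


# Both spellings of each pair produce the same keyword.
print(split_keywords('-"pacific ocean"'), split_keywords('"-pacific ocean"'))
print(split_keywords('title:"two words"'), split_keywords('"title:two words"'))
print(split_keywords('-@"2 days ago"'), split_keywords('"-@2 days ago"'))
```

With this approach the quotes only protect the spaces, so it does not matter whether they surround the whole keyword or only its value. Note that `shlex` raises `ValueError` on an unbalanced quote, which is stricter than the behaviour described for the current parser.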
133+
134+
<a id="highlighting_limitations"></a>
135+
## Highlighting limitations
136+
137+
A [text keyword](#text-keyword) can be a single word such as `ocean`. If the article contains `ocean` or `oceanographer`, the `ocean` fragment is highlighted.
138+
A [text keyword](#text-keyword) can also be successive words enclosed in quotes such as `"pacific ocean"`. If the article contains `pacific oceanographer`, the `pacific ocean` fragment is highlighted.
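
The two cases above behave like plain, case-insensitive substring matching. A minimal Python sketch of that idea follows. It assumes substring matching and uses square brackets as a stand-in for the visual highlight; `highlight` is a hypothetical function, not the one used by Tiny Tiny RSS.

```python
import re


def highlight(text, keyword):
    """Wrap every case-insensitive occurrence of `keyword` in square brackets."""
    pattern = re.compile(re.escape(keyword), re.IGNORECASE)
    return pattern.sub(lambda m: "[" + m.group(0) + "]", text)


print(highlight("The oceanographer studies the ocean", "ocean"))
print(highlight("A pacific oceanographer", "pacific ocean"))
```

Because the match is made on the literal search string, a word that only shares a stem with the keyword is left untouched by this kind of approach.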
139+
140+
{: .note }
141+
> Due to an incomplete implementation, the word is not correctly highlighted when a [stemmed](#word-stemming) variant is used. For example, if the user searches for _secured_, articles containing _security_ are displayed. However, as _secured_ is not in the content, it is not highlighted.
142+
143+
If the searched word is prefixed by the negation `-`, it is not highlighted.
144+
145+
{: .note }
146+
> Due to an incomplete implementation, the negation with `!` is not detected, so words negated in such a way are still highlighted. Use the `-` sign instead.
147+
148+
149+
<a id="undetected_errors"></a>
150+
## Undetected errors
151+
152+
{: .warning }
153+
> Due to current parser limitations, most syntax errors are undetected. When a user enters a badly formatted search query, it is incorrectly parsed, no message is displayed, and the results are unexpected.
154+
155+
156+
<a id="contributions"></a>
157+
## Contributions are welcome
158+
159+
The current search query parser implements basic keyword splitting. It works in most cases, but has the limitations presented above.
160+
161+
It is maintained on a best-effort basis until someone volunteers to create a full parser with:
162+
- logical operators and grouping around any type of keyword
163+
- highlighting supported in all cases
164+
- detection of invalid queries, with a warning displayed
165+
166+
Contributions are welcome!
