fix bug in getLargeLog() with both text and regex matches
update README.md
mikebd committed Nov 3, 2023
1 parent a237860 commit eecbae8
Showing 3 changed files with 90 additions and 103 deletions.
172 changes: 81 additions & 91 deletions README.md
@@ -34,9 +34,6 @@ Some assumptions could be considered TODO items for future enhancement
* Hot searches could have dedicated caches that are eagerly refreshed when the file changes
* Testing coverage should be added for the `controller` package
* Add `GOMAXPROCS` to Dockerfile or set that in the code
- * There is a [get-large-log](https://github.com/mikebd/go-read-var-log/tree/get-large-log) branch where I hope to
-   experiment with a more efficient way to read very large files, but it is not yet complete and is not formally
-   part of this submission

## Endpoints

@@ -130,23 +127,23 @@ wifi.log
```bash
❯ go run ./main -p 80
...
- logs.go:30: /api/v1/logs/1GB-9million.log?q=error&r=\sfecig$ returned 1 lines in 2.75945789s
- logs.go:30: /api/v1/logs/1GB-9million.log?q=error&r=\sfecig$ returned 1 lines in 2.534780081s
- logs.go:30: /api/v1/logs/1GB-9million.log?q=error&r=\sfecig$ returned 1 lines in 2.148826909s
- logs.go:30: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$ returned 1 lines in 1.427393941s
- logs.go:30: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$ returned 1 lines in 1.447307855s
- logs.go:30: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$ returned 1 lines in 1.421464319s
+ logs.go:31: /api/v1/logs/1GB-9million.log?q=error&r=\sfecig$&n=1 returned 1 lines using large strategy in 1.23877778s
+ logs.go:31: /api/v1/logs/1GB-9million.log?q=error&r=\sfecig$&n=1 returned 1 lines using large strategy in 1.202586854s
+ logs.go:31: /api/v1/logs/1GB-9million.log?q=error&r=\sfecig$&n=1 returned 1 lines using large strategy in 1.198800304s
+ logs.go:31: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1 returned 1 lines using large strategy in 856.594448ms
+ logs.go:31: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1 returned 1 lines using large strategy in 866.171423ms
+ logs.go:31: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1 returned 1 lines using large strategy in 863.319268ms
```

☝️ was generated by the following `curl` commands:

```bash
- ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=error&r=\sfecig$"
- ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=error&r=\sfecig$"
- ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=error&r=\sfecig$"
- ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$"
- ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$"
- ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$"
+ ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=error&r=\sfecig$&n=1"
+ ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=error&r=\sfecig$&n=1"
+ ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=error&r=\sfecig$&n=1"
+ ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1"
+ ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1"
+ ❯ curl "localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1"
```

The `curl` output for all of the above is:
@@ -161,128 +158,121 @@

### Small file

- `/var/log/daily.out` is a 419KB file with 6,892 lines.
+ `/var/log/daily.out` is a 498KB file with 7,992 lines.

```bash
❯ ab -c 6 -k -n 10000 'localhost/api/v1/logs/daily.out'

Document Path: /api/v1/logs/daily.out
- Document Length: 1765 bytes
+ Document Length: 1930 bytes

Concurrency Level: 6
- Time taken for tests: 3.193 seconds
+ Time taken for tests: 2.062 seconds
Complete requests: 10000
Failed requests: 0
Keep-Alive requests: 10000
- Total transferred: 19080000 bytes
- HTML transferred: 17650000 bytes
- Requests per second: 3131.67 [#/sec] (mean)
- Time per request: 1.916 [ms] (mean)
- Time per request: 0.319 [ms] (mean, across all concurrent requests)
- Transfer rate: 5835.18 [Kbytes/sec] received
+ Total transferred: 20730000 bytes
+ HTML transferred: 19300000 bytes
+ Requests per second: 4849.47 [#/sec] (mean)
+ Time per request: 1.237 [ms] (mean)
+ Time per request: 0.206 [ms] (mean, across all concurrent requests)
+ Transfer rate: 9817.34 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
- Processing: 1 2 0.5 2 5
- Waiting: 0 2 0.5 2 5
- Total: 1 2 0.5 2 5
+ Processing: 0 1 0.3 1 3
+ Waiting: 0 1 0.3 1 3
+ Total: 0 1 0.3 1 3

Percentage of the requests served within a certain time (ms)
- 50% 2
- 66% 2
- 75% 2
- 80% 2
- 90% 3
- 95% 3
- 98% 3
- 99% 3
- 100% 5 (longest request)
+ 50% 1
+ 66% 1
+ 75% 1
+ 80% 1
+ 90% 2
+ 95% 2
+ 98% 2
+ 99% 2
+ 100% 3 (longest request)
```
### Large file with fixed text and regex matching
#### Non-default GOMAXPROCS=8
- When run for 10,000 requests, it failed somewhere between 4,000-5,000 requests with a timeout as response times escalated to over 1 minute.
- For > 2,000 requests (started as 10,000 but interrupted after 2,000):
```bash
- ❯ ab -c 6 -k -n 10000 'localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$'
+ ❯ ab -c 6 -k -n 1000 'localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1'

- Document Path: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$
+ Document Path: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1
Document Length: 117 bytes

Concurrency Level: 6
- Time taken for tests: 1036.804 seconds
- Complete requests: 2105
+ Time taken for tests: 259.356 seconds
+ Complete requests: 1000
Failed requests: 0
- Keep-Alive requests: 2105
- Total transferred: 545195 bytes
- HTML transferred: 246285 bytes
- Requests per second: 2.03 [#/sec] (mean)
- Time per request: 2955.261 [ms] (mean)
- Time per request: 492.543 [ms] (mean, across all concurrent requests)
- Transfer rate: 0.51 [Kbytes/sec] received
+ Keep-Alive requests: 1000
+ Total transferred: 259000 bytes
+ HTML transferred: 117000 bytes
+ Requests per second: 3.86 [#/sec] (mean)
+ Time per request: 1556.135 [ms] (mean)
+ Time per request: 259.356 [ms] (mean, across all concurrent requests)
+ Transfer rate: 0.98 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
Connect: 0 0 0.0 0 0
- Processing: 1905 2946 1771.9 2766 36280
- Waiting: 1905 2946 1771.9 2766 36280
- Total: 1905 2946 1771.9 2766 36280
+ Processing: 909 1550 51.8 1541 1788
+ Waiting: 908 1550 51.8 1540 1788
+ Total: 909 1550 51.8 1541 1788

Percentage of the requests served within a certain time (ms)
- 50% 2766
- 66% 2811
- 75% 2853
- 80% 2885
- 90% 3007
- 95% 3225
- 98% 4481
- 99% 6722
- 100% 36280 (longest request)
+ 50% 1541
+ 66% 1559
+ 75% 1568
+ 80% 1578
+ 90% 1605
+ 95% 1632
+ 98% 1692
+ 99% 1757
+ 100% 1788 (longest request)
```
#### Default GOMAXPROCS=1
- This fails for `-n 10000` with a timeout, but "succeeds" (with poor performance) for `-n 50`.
There is a lot of room for optimization here if this is a use case that must be supported.
```bash
- ❯ ab -c 6 -k -n 50 'localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$'
+ ❯ ab -c 6 -k -n 1000 'localhost/api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1'

- Document Path: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$
+ Document Path: /api/v1/logs/1GB-9million.log?q=|error|&r=\sfecig$&n=1
Document Length: 117 bytes

Concurrency Level: 6
- Time taken for tests: 31.893 seconds
- Complete requests: 50
+ Time taken for tests: 279.119 seconds
+ Complete requests: 1000
Failed requests: 0
- Keep-Alive requests: 50
- Total transferred: 12950 bytes
- HTML transferred: 5850 bytes
- Requests per second: 1.57 [#/sec] (mean)
- Time per request: 3827.187 [ms] (mean)
- Time per request: 637.865 [ms] (mean, across all concurrent requests)
- Transfer rate: 0.40 [Kbytes/sec] received
+ Keep-Alive requests: 1000
+ Total transferred: 259000 bytes
+ HTML transferred: 117000 bytes
+ Requests per second: 3.58 [#/sec] (mean)
+ Time per request: 1674.712 [ms] (mean)
+ Time per request: 279.119 [ms] (mean, across all concurrent requests)
+ Transfer rate: 0.91 [Kbytes/sec] received

Connection Times (ms)
min mean[+/-sd] median max
- Connect: 0 0 0.1 0 0
- Processing: 1832 3469 1508.0 2750 6374
- Waiting: 1832 3469 1508.0 2750 6374
- Total: 1832 3469 1508.0 2750 6374
+ Connect: 0 0 0.0 0 0
+ Processing: 1276 1663 53.5 1654 2018
+ Waiting: 1276 1663 53.5 1653 2018
+ Total: 1276 1663 53.5 1654 2018

Percentage of the requests served within a certain time (ms)
- 50% 2750
- 66% 3738
- 75% 5137
- 80% 5368
- 90% 5905
- 95% 6206
- 98% 6374
- 99% 6374
- 100% 6374 (longest request)
+ 50% 1654
+ 66% 1667
+ 75% 1679
+ 80% 1687
+ 90% 1714
+ 95% 1745
+ 98% 1803
+ 99% 1932
+ 100% 2018 (longest request)
```
17 changes: 9 additions & 8 deletions service/getLog.go
@@ -83,14 +83,15 @@ func getLargeLog(params *GetLogParams) GetLogResult {
for scanner.Scan() {
line := scanner.Text()

-		if params.matchRequested() {
-			// Cheaper tests first, short circuit more expensive tests
-			if params.textMatchRequested() && strings.Contains(line, params.TextMatch) {
-				blockResult = append(blockResult, line)
-			} else if params.regexMatchRequested() && params.Regex.MatchString(line) {
-				blockResult = append(blockResult, line)
-			}
-		} else {
+		matched := true
+		// Cheaper tests first, short circuit more expensive tests
+		if params.textMatchRequested() && !strings.Contains(line, params.TextMatch) {
+			matched = false
+		}
+		if matched && params.regexMatchRequested() && !params.Regex.MatchString(line) {
+			matched = false
+		}
+		if matched {
			blockResult = append(blockResult, line)
		}
	}
4 changes: 0 additions & 4 deletions service/getLogTypes.go
@@ -31,10 +31,6 @@ func (p *GetLogParams) regexMatchRequested() bool {
return p.Regex != nil
}

- func (p *GetLogParams) matchRequested() bool {
- 	return p.textMatchRequested() || p.regexMatchRequested()
- }

type GetLogResult struct {
LogLines []string
Strategy string
