Skip to content

Conversation

@Isira-Seneviratne
Copy link
Member


Store the response headers in a TreeMap. This allows the header value retrieval to be optimised as TreeMap orders its keys using a comparator.

@TobiGr
Copy link
Contributor

TobiGr commented Jul 20, 2024

Good catch. iterating through the map elements does not make any sense. Using the default get / getOrDefault method is the proper way to go 👍

However, I do not understand why we should specifically use a TreeMap. It guarantees a look-up cost of log(n). Same for put operations. However, the usually used HashMap has O(1) if the bucket size is good and not too many items are in the Map. There shouldn't be hundreds of elements in the map, but rather 25-50. So I think using passed map is more efficient than creating a new Map which needs to process all elements first. This is unnecessary pre-processing and comes with additional computation time and requires additional space.

@Isira-Seneviratne
Copy link
Member Author

Isira-Seneviratne commented Jul 20, 2024

I've revised the code a bit, to reuse the map if it is already a SortedMap. That way, no extra objects need to be created in that scenario, such as with OkHttp's header map implementation.

Copy link
Contributor

@TobiGr TobiGr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@ShareASmile ShareASmile added enhancement New feature or request codequality Improvements to the codebase to improve the code quality labels Mar 15, 2025
@TobiGr TobiGr force-pushed the Response-TreeMap branch from 7fb4dbe to 8921c32 Compare March 15, 2025 15:51
Copy link
Contributor

@TobiGr TobiGr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A related test is failing, something is wrong with the changes

Invalid response type (got "null", excepted a JSON response) (response code 200)
org.schabi.newpipe.extractor.exceptions.ExtractionException: Invalid response type (got "null", excepted a JSON response) (response code 200)
	at org.schabi.newpipe.extractor.services.youtube.extractors.YoutubeSuggestionExtractor.suggestionList(YoutubeSuggestionExtractor.java:71)
	at org.schabi.newpipe.extractor.services.youtube.YoutubeSuggestionExtractorTest.testIfSuggestions(YoutubeSuggestionExtractorTest.java:55)

@Isira-Seneviratne
Copy link
Member Author

A related test is failing, something is wrong with the changes

Invalid response type (got "null", excepted a JSON response) (response code 200)
org.schabi.newpipe.extractor.exceptions.ExtractionException: Invalid response type (got "null", excepted a JSON response) (response code 200)
	at org.schabi.newpipe.extractor.services.youtube.extractors.YoutubeSuggestionExtractor.suggestionList(YoutubeSuggestionExtractor.java:71)
	at org.schabi.newpipe.extractor.services.youtube.YoutubeSuggestionExtractorTest.testIfSuggestions(YoutubeSuggestionExtractorTest.java:55)

I wasn't able to reproduce the issue locally, the header value is being returned as expected.

@TobiGr
Copy link
Contributor

TobiGr commented Mar 19, 2025

The Exception is only thrown when running with mocks and happens here:

final String contentTypeHeader = response.getHeader("Content-Type");
if (isNullOrEmpty(contentTypeHeader) || !contentTypeHeader.contains("application/json")) {
throw new ExtractionException("Invalid response type (got \"" + contentTypeHeader
+ "\", excepted a JSON response) (response code "
+ response.responseCode() + ")");
}

The mock response has the header content-type while the code expects Content-Type. Both use a TreeMap. The one used in the normal Downloader is case insensitive and the mock one is not. Recording the mock again did not fix the issue. This might be a serialization issue in which the gson builder does something unexpected in MockDownloader.

Copy link
Member

@Stypox Stypox left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Somewhere we have to iterate through all headers anyway: either in the Response constructor, or in the getHeader() function, since we don't know with which Map the Response is being built (it might be a SortedMap sorted according to a wrong criterion). So I would suggest recreating a new TreeMap every time in the Response constructor, since that's called only once, while getHeader might be called multiple times.

Replying to @TobiGr, I think we can't use a HashMap if we want a sorted map. But in any case, given the small amount of headers that there usually are (<20), the O(log n) of the TreeMap is actually much faster than the O(1) of the HashMap: hash maps have a big constant in the O(1) and get faster only after n grows large enough.

Another option would be to create a hash map where keys are already lowercased in the constructor, and then just accessing the map with name.lowercase() in getHeader().

In general I don't think this change is so important anyway, since the O(n) we had in getHeader() is not so bad given that n is small, and getHeader() is basically never used anyway:
image

So I would just keep the old O(n) behavior in getHeader(), but O(1) in the constructor, rather than switching to O(n) in the constructor (which is executed all the time) and O(log n)/O(1) in getHeader() (which is basically never called).

public Response(final int responseCode,
final String responseMessage,
@Nullable final Map<String, List<String>> responseHeaders,
@Nonnull final Map<String, List<String>> responseHeaders,
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is this changed to be @Nonnull? Isn't this a breaking change?

Comment on lines +28 to +33
if (responseHeaders instanceof SortedMap) {
this.responseHeaders = (SortedMap<String, List<String>>) responseHeaders;
} else {
this.responseHeaders = new TreeMap<>(String.CASE_INSENSITIVE_ORDER);
this.responseHeaders.putAll(responseHeaders);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Mmmh, this still means that all items are reprocessed when building a Response in case the headers were not already in a SortedMap. Also, how can you be sure that responseHeaders was built using String.CASE_INSENSITIVE_ORDER?

I would just build a new TreeMap every time.

@Stypox Stypox closed this Mar 22, 2025
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

codequality Improvements to the codebase to improve the code quality enhancement New feature or request

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants