And this sample testcase:

    import sys

    def get_input_from_stdin():
        # Despite the name, this reads the first command-line argument.
        return sys.argv[1]

    def get_hardcoded_filename():
        return "output.txt"

    def combine_input_and_hardcoded():
        user_input = get_input_from_stdin()
        return f"{user_input}_log.txt"

    def intermediate_variable():
        part1 = get_input_from_stdin()
        part2 = "_data"
        filename = part1 + part2 + ".txt"
        return filename

    def filename_from_multiple_functions():
        prefix = get_input_from_stdin()
        suffix = get_hardcoded_filename()
        return prefix + "_" + suffix

    def nested_function_call():
        def inner():
            return get_input_from_stdin()
        filename = inner() + "_nested.txt"
        return filename


    def custom_open(filepath):
        # Sink: the filename argument reaches the built-in open().
        with open(filepath, 'w') as f:
            f.write("This is a test file.\n")

    def main():
        custom_open(get_hardcoded_filename())
        custom_open(combine_input_and_hardcoded())
        custom_open(intermediate_variable())
        custom_open(filename_from_multiple_functions())
        custom_open(nested_function_call())

    if __name__ == "__main__":
        main()

As expected, CodeQL returns one path for each source `Expr`. However, I would like to keep only the most recent one. For instance, for the function `intermediate_variable()`, the results report that all of the expressions flow into the `open()` call, namely `part1`, `part2`, `get_input_from_stdin()`, `part1 + part2`, and the literals. I would like to keep only the latest source, which is `filename`.


Obviously I could solve this by checking line numbers, but that's not going to fly for slightly more complex scenarios. Another solution I was considering is a second DataFlow configuration that checks intra-procedural flows among local expressions, but before doing that I'm curious whether there's a more "CodeQL"-like way to do this. Thanks.
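The second idea (a local flow check among the candidate sources) can be pictured as a small graph problem. The sketch below is a plain-Python toy model, not a CodeQL query: the edge list is a hand-written, hypothetical approximation of the taint steps in `intermediate_variable()`, and `latest_sources` is an illustrative helper name. A candidate is kept only if it does not flow into another candidate that also reaches the sink.

```python
from collections import defaultdict

# Hypothetical, hand-written taint steps for intermediate_variable();
# a real analysis would derive these from the program.
EDGES = [
    ("get_input_from_stdin()", "part1"),
    ('"_data"', "part2"),
    ("part1", "part1 + part2"),
    ("part2", "part1 + part2"),
    ("part1 + part2", 'part1 + part2 + ".txt"'),
    ('".txt"', 'part1 + part2 + ".txt"'),
    ('part1 + part2 + ".txt"', "filename"),
    ("filename", "open(filepath)"),
]

def reachable(graph, start):
    """Return every node reachable from start via one or more steps."""
    seen, stack = set(), [start]
    while stack:
        for nxt in graph[stack.pop()]:
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def latest_sources(edges, sink):
    """Keep candidates that reach the sink without passing through another candidate."""
    graph = defaultdict(list)
    for src, dst in edges:
        graph[src].append(dst)
    nodes = {n for edge in edges for n in edge}
    reach = {n: reachable(graph, n) for n in nodes}
    candidates = {n for n in nodes if sink in reach[n]}
    return {c for c in candidates if not (reach[c] & candidates)}

print(latest_sources(EDGES, "open(filepath)"))
```

Under these assumed edges, only `filename` survives the filter, because every other candidate flows into it before reaching the sink.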

Accepted answer:

Another option is to use DataFlow::Global instead of TaintTracking::Global. Then you will only get expressions that actually flow to the sink, without being modified in some way first.
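To see why switching modules prunes the results, here is a toy model in plain Python (not the CodeQL API). Each step in `intermediate_variable()` is hand-labelled as `"value"` (the value is passed along unchanged, as in an assignment) or `"taint"` (the value is modified, as in string concatenation). These labels and the helper name `sources_reaching` are assumptions made for illustration.

```python
# Toy model only; step kinds are hand-labelled assumptions, not CodeQL output.
STEPS = [
    ("get_input_from_stdin()", "part1", "value"),
    ("part1", "part1 + part2", "taint"),
    ('"_data"', "part2", "value"),
    ("part2", "part1 + part2", "taint"),
    ("part1 + part2", 'part1 + part2 + ".txt"', "taint"),
    ('part1 + part2 + ".txt"', "filename", "value"),
    ("filename", "open(filepath)", "value"),
]

def sources_reaching(sink, steps, allow_taint):
    """Walk backwards from sink, following taint steps only if allowed."""
    found, frontier = set(), [sink]
    while frontier:
        node = frontier.pop()
        for src, dst, kind in steps:
            if dst == node and (allow_taint or kind == "value") and src not in found:
                found.add(src)
                frontier.append(src)
    return found

# Value-preserving steps only, analogous to plain data flow.
data_flow = sources_reaching("open(filepath)", STEPS, allow_taint=False)
# Value-preserving plus taint steps, analogous to taint tracking.
taint_flow = sources_reaching("open(filepath)", STEPS, allow_taint=True)

print(sorted(data_flow))
print(sorted(taint_flow))
```

In this model the data-flow variant still reports the full concatenation expression alongside `filename`, since the assignment preserves its value, so a real query may need a further restriction on sources if exactly one node is wanted.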

Answer permalink: https://github.com/github/codeql/discussions/20656#discussioncomment-14723271
Answer selected by elManto