-
PADS: A Language and System for Automatic Tool Generation from Ad Hoc Data Sources
An ad hoc data source is any semistructured data source for which useful data analysis and transformation tools are not readily available. Such data must be queried, transformed and displayed by systems administrators, computational biologists, financial analysts and hosts of others on a regular basis. PADS is a domain-specific language extension for C and OCaml that allows programmers to specify the formats of ad hoc data sources using a set of type declarations. The PADS compiler generates a collection of useful tools from these declarations, including a parser, printer, data validator, formatter, error profiler, XML converter and query engine. Programmers may use PADS by writing a description by hand or by asking the system to infer a PADS description directly from example data. The multi-phase inference algorithm operates by inferring a candidate format and then optimizing it relative to an information-theoretic scoring function. Inferred descriptions may be automatically pushed through the PADS compiler to generate fully functional tools with no human intervention. The entire process takes just seconds to complete on 1K of example data, and has the potential to greatly improve the productivity of data analysis. This ongoing research is a collaboration between AT&T Research and Princeton University. It involves Mary Fernandez, Kathleen Fisher, Yitzhak Mandelbaum, David Walker, Qian Xi, and Kenny Zhu. More information, software and research papers are available at www.padsproj.org.
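The inference idea can be sketched in miniature. The following Python sketch is illustrative only: real PADS works through C/OCaml type declarations and an information-theoretic scoring pass, and the `Pint`/`Pfloat`/`Pstring` names only loosely echo PADS base types. It guesses a flat record description from delimiter-separated example lines:

```python
import re

def infer_field_type(values):
    """Guess a primitive type for one column of example strings."""
    if all(re.fullmatch(r"-?\d+", v) for v in values):
        return "Pint"
    if all(re.fullmatch(r"-?\d+\.\d+", v) for v in values):
        return "Pfloat"
    return "Pstring"

def infer_description(lines, sep=","):
    """Infer a flat record description from delimited example data."""
    rows = [line.split(sep) for line in lines]
    width = len(rows[0])
    # A real inference engine would fall back to unions/options here.
    assert all(len(r) == width for r in rows), "ragged rows need a richer description"
    columns = list(zip(*rows))
    return [infer_field_type(col) for col in columns]

# Example: two log lines with an int, a float, and free text.
print(infer_description(["42,3.14,ok", "7,2.71,fail"]))
# → ['Pint', 'Pfloat', 'Pstring']
```

A real system would then refine this candidate against a scoring function that trades description complexity against how well it fits the data.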
published: 06 Sep 2016
-
Pablo thesis defense : Adaptive Semantic Annotation of Entity and Concept Mentions in Text
Recent years have seen increased interest in knowledge repositories that are useful across applications, in contrast to the creation of ad hoc or application-specific databases. These knowledge repositories figure as a central provider of unambiguous identifiers and semantic relationships between entities. As such, these shared entity descriptions serve as a common vocabulary for exchanging and organizing information in different formats and for different purposes. There has therefore been remarkable interest in systems that can automatically tag textual documents with identifiers from shared knowledge repositories, so that the content of those documents is described in a vocabulary that is unambiguously understood across applications.
Tagging textual documents according to these knowledge bases is a challenging task. It involves recognizing the entities and concepts mentioned in a particular passage and attempting to resolve any ambiguity of language in order to choose one of many possible meanings for a phrase. There has been substantial work on recognizing and disambiguating entities for specialized applications, or constrained to limited entity types and particular types of text. In the context of shared knowledge bases, since each application has potentially very different needs, systems must have unprecedented breadth and flexibility to ensure their usefulness across applications. Documents may exhibit different language and discourse characteristics, discuss very diverse topics, or require focusing on parts of the knowledge repository that are inherently harder to disambiguate. In practice, for developers looking for a system to support their use case, it is often unclear whether an existing solution is applicable, leading those developers to trial-and-error and ad hoc use of multiple systems in an attempt to achieve their objective.
In this dissertation, I propose a conceptual model that unifies related techniques in this space under a common multi-dimensional framework, enabling the elucidation of the strengths and limitations of each technique and supporting developers in their search for a suitable tool for their needs. Moreover, the model serves as the basis for the development of flexible systems able to support document tagging for different use cases. I describe such an implementation, DBpedia Spotlight, along with extensions we made to the DBpedia knowledge base to support it. I report evaluations of this tool on several well-known data sets, and demonstrate applications to diverse use cases for further validation.
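The core spotting-and-disambiguation loop can be illustrated with a toy sketch. This is not DBpedia Spotlight's actual algorithm; the knowledge base, entity IDs, and context profiles below are invented. A surface-form dictionary maps phrases to candidate entities, and context-word overlap picks among ambiguous candidates:

```python
# Toy surface-form dictionary: phrase -> candidate entities with context profiles.
KB = {
    "jaguar": [
        {"id": "Jaguar_(animal)", "context": {"cat", "wild", "forest"}},
        {"id": "Jaguar_Cars",     "context": {"car", "engine", "luxury"}},
    ],
}

def annotate(text):
    """Tag each known mention with the candidate whose profile best
    overlaps the surrounding words."""
    tokens = text.lower().split()
    annotations = []
    for i, tok in enumerate(tokens):
        candidates = KB.get(tok)
        if not candidates:
            continue
        context = set(tokens[:i] + tokens[i + 1:])
        # Score candidates by overlap between profile and passage.
        best = max(candidates, key=lambda c: len(c["context"] & context))
        annotations.append((tok, best["id"]))
    return annotations

print(annotate("the jaguar engine is a luxury"))
# → [('jaguar', 'Jaguar_Cars')]
```

Real systems replace the dictionary with learned statistics over anchor texts and weight context words rather than counting raw overlaps, but the shape of the problem, spotting then disambiguation, is the same.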
published: 21 Dec 2013
-
Analysing publications and funding with the Europe PMC REST API
Europe PMC is a database of life science publications and preprints. Articles in Europe PMC are enriched with additional resources, such as author, funding and affiliation information, citations and links to data, peer reviews and more.
In this webinar, we will explain how to programmatically access publications and related information using the Europe PMC Articles RESTful API. Michael Parkin, Data Scientist at Europe PMC, will demonstrate how to access the API from the browser and provide some scripting examples in Python and R. He will outline the different API methods, the rules for filtering and pagination, and the available output formats. Following on, Antonio Campello, Senior Data Scientist at the Wellcome Trust, will explain how Wellcome uses Europe PMC APIs to gain insights into the outputs of the research they fund. He will discuss several use cases, including charting Wellcome's research portfolio, developing novel technologies to link grants to academic publications, and tracking which grants have been influential in the COVID-19 literature.
Who is this course for?
This webinar is for users interested in exploring programmatic approaches to accessing publications and preprints, as well as related metadata. It will also be of interest to experienced API users who want to learn more about funding portfolio analysis. Some prior knowledge of programmatic access is recommended.
Outcomes
By the end of the webinar you will be able to:
Access the Europe PMC API
Describe the structure of the Europe PMC API
Identify suitable API endpoints to extract data types of interest
Make literature searches programmatically
Interpret the RESTful web service output
Know where to find help and documentation
This webinar was recorded on 24 November 2021.
The presentation slides are available at https://www.ebi.ac.uk/training/events/analysing-publications-and-funding-europe-pmc-rest-api/
Future webinars are listed at https://www.ebi.ac.uk/training/live-events
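A minimal Python sketch of the kind of request the webinar covers: building a search URL for the Europe PMC Articles RESTful API with cursor-based pagination. The parameter names (`query`, `format`, `pageSize`, `cursorMark`) follow the public Europe PMC documentation; verify them against the current docs before relying on this, and the example query string is only an illustration.

```python
from urllib.parse import urlencode

BASE = "https://www.ebi.ac.uk/europepmc/webservices/rest/search"

def search_url(query, page_size=25, cursor="*", fmt="json"):
    """Build a Europe PMC search request URL. Pagination is cursor-based:
    start with cursorMark="*" and pass each response's nextCursorMark
    into the following call."""
    params = {
        "query": query,
        "format": fmt,
        "pageSize": page_size,
        "cursorMark": cursor,
    }
    return BASE + "?" + urlencode(params)

# Example: open-access articles funded by a given agency (illustrative query).
url = search_url('GRANT_AGENCY:"Wellcome Trust" AND OPEN_ACCESS:y', page_size=10)
print(url)
```

Fetching `url` (e.g. with `urllib.request` or `requests`) returns JSON whose `resultList.result` entries carry the article metadata; looping until `nextCursorMark` stops changing walks the full result set.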
published: 24 Nov 2021
-
Cataloging, Gender, and RDA Rule 9.7
http://ala.org/alcts
An ALCTS webinar.
One of the paradoxes at the heart of library cataloging and classification is the demand to fix in place elements of a record even when those elements are in flux. We have to name things in order to locate them, which means we can’t escape encounters with the politics of naming. This webinar will discuss the particular example of RDA rule 9.7, a rule that required gender, once recorded, to remain fixed in RDA-compliant name authority records, until a group of catalogers fought to make it optional.
Prior to January 2016, rule 9.7 directed catalogers to record gender when identifying persons. Although RDA gave catalogers the flexibility to record more than two gender labels, RDA rule 9.7 limited Name Authority Cooperative Program (NACO) catalogers to a binary controlled vocabulary: male, female, or not known. Queer theory tells us that gender simply doesn’t work this way. Gender is socially constructed and contingent. Requiring a binary label meant requiring that catalogers ignore the wishes of many trans and gender-variant authors, as well as authors who simply did not wish to disclose their gender. With this problem in mind, a group of catalogers lobbied the international RDA Steering Committee for a rule change and ultimately succeeded. Additionally, after the rule change a PCC Ad Hoc Task Group was formed to recommend best practices for recording gender in name authority records.
The product of both theorizing and activism, the RDA rule change also represents a powerful moment of praxis, reminding librarians that thinking and working together can change the profession for good.
Presented on March 15, 2017, by Amber Billey
published: 20 Oct 2017
-
"Natural Language as a Cognitive Computation on Data-compressed grounded symbols" by Douglas Brash
This is a ~1 hour talk on linguistics and computation by Douglas Brash (https://medicine.yale.edu/profile/douglas-brash/, https://scholar.google.com/citations?user=QHCv7ZIAAAAJ&hl=en).
published: 16 Jul 2024
-
T18.2: Getting Situated: Comparative Analysis of Language Models With Experimental Categorization...
Andy Edinger
published: 27 Oct 2022
-
Ambiguity in Natural language processing in Hindi | NLP series #3
In this video, we explain ambiguity in Natural Language Processing.
Take the Full course of Natural Language Processing: https://bit.ly/3aWsmnJ
Other Semester 08 Courses:
[Bundles]
[Human Machine Interaction + Distributed Computing + Adhoc Wireless Networks] https://bit.ly/2vpD4nH
[Human Machine Interaction + Distributed Computing] https://bit.ly/2WhyoLq
[Human Machine Interaction + Distributed Computing + Natural Language Processing] https://bit.ly/2IOlX1R
[Individual Courses]
Human Machine Interaction: https://bit.ly/33qnyV1
Parallel and Distributed Computing: https://bit.ly/2Qj9M13
Ad-Hoc Wireless Networks: https://bit.ly/38Qe3Q5
Natural Language Processing: https://bit.ly/3aWsmnJ
Take the Full course of Machine Learning: https://lastmomenttuitions.com/course/machine-learning/
published: 04 Sep 2020
-
Elasticsearch Data Exploration in Your Terminal - Brad Lhotsky
You've seen the pretty graphs. Visuals are great for signaling that there is a problem somewhere. But how do you go from pretty graphs to root-cause analysis? Let's talk about integrating Elasticsearch-based dashboards back into the command-line workflows I love.
This talk is an overview of a tool I developed while working at Booking.com to drastically reduce the time and complexity of performing incident response against rich, structured data in Elasticsearch. It was developed with the help of the security and fraud teams to perform ad-hoc queries critical for incident response. The tool served the team well and has been under active development ever since. It continues to grow capabilities aimed at making ad-hoc analysis simple, easy, and accessible to hardened command-line jockeys and command-line newbies alike.
Join me to learn how to bring the logging data you love back to your terminal!
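The kind of ad-hoc query such a tool assembles can be sketched as follows. This is not the tool from the talk: the field names (`src_ip`, `status`, `@timestamp`) are invented for illustration, while the `bool`/`filter`/`term`/`range` shapes are standard Elasticsearch Query DSL.

```python
import json

def adhoc_query(field_filters, since="now-1h"):
    """Assemble an Elasticsearch bool query for an ad-hoc incident search:
    exact-match term filters plus a time-range guard, the way a CLI
    wrapper might turn flags into a request body."""
    filters = [{"term": {field: value}} for field, value in field_filters.items()]
    filters.append({"range": {"@timestamp": {"gte": since}}})
    return {"query": {"bool": {"filter": filters}}, "size": 50}

# Example: requests from one source IP that were denied in the last hour.
body = adhoc_query({"src_ip": "203.0.113.9", "status": 403})
print(json.dumps(body, indent=2))
```

Piping a body like this to the `_search` endpoint (via curl or a client library) returns matching documents ready for terminal-side aggregation.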
published: 09 Jun 2021
-
Graphs in Criminal Intelligence - Webinar | GraphAware Hume
The webinar Graphs in Criminal Intelligence was run by Dan Newland, GraphAware's General Manager ANZ, who has extensive experience implementing graph solutions in several government organizations in Australia. Graphs can be truly transformational for law enforcement agencies. Learn how a cutting-edge graph solution removes obstacles from the criminal intelligence process and increases its efficiency.
---------
0:00 Introduction webinar
1:25 Introduction Dan Newland
2:49 What is Neo4j and what is Hume?
4:50 Graph analytics introduction
9:30 The challenges intelligence teams face
22:00 Practical implementation of intelligence graphs
37:52 Demonstration of Hume Intelligence Analysis
50:55 Q&A
---------
Hume is an enterprise-level graph analytics solution that allows you to easily convert multiple distributed data sources into a single, connected source of truth: the knowledge graph.
Interested in seeing what Hume could mean for your project or company? Request a demo https://graphaware.com/contact/humedemo/
Interesting articles:
- Knowledge Graphs with Entity Relations: Is Jane Austen employed by Google? - https://graphaware.com/nlp/2020/10/20/ere-jane-austen.html
- Unstructured Data Processing in Hume - https://graphaware.com/features/unstructured-data-processing.html
- A Knowledge Graph-based Perspective on Named Entity Disambiguation in the Healthcare Domain - https://graphaware.com/blog/hume/KG-based-Perspective-on-NED.html
published: 26 Oct 2022
-
$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
Original paper: https://arxiv.org/abs/2407.18134
Title: $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
Authors: Vlad Sobal, Mark Ibrahim, Randall Balestriero, Vivien Cabannes, Diane Bouchacourt, Pietro Astolfi, Kyunghyun Cho, Yann LeCun
Abstract:
Learning good representations involves capturing the diverse ways in which data samples relate. Contrastive loss - an objective matching related samples - underlies methods from self-supervised to multimodal learning. Contrastive losses, however, can be viewed more broadly as modifying a similarity graph to indicate how samples should relate in the embedding space. This view reveals a shortcoming in contrastive learning: the similarity graph is binary, as only one sample is the related positive...
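The binary-graph shortcoming is easy to see in code. The NumPy sketch below is one reading of the abstract, not the paper's exact objective: it computes a cross-entropy between the model's softmaxed embedding similarities and a row-normalized target similarity graph. A one-hot graph recovers the usual contrastive target, while soft weights encode graded relations between samples.

```python
import numpy as np

def softmax(x, axis=-1):
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)

def graph_contrastive_loss(z, sim_graph, temperature=0.1):
    """Cross-entropy between predicted similarity distributions and a
    row-normalized target graph over the batch."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)  # unit-normalize embeddings
    logits = z @ z.T / temperature
    np.fill_diagonal(logits, -np.inf)                 # exclude self-pairs
    p = softmax(logits, axis=1)
    target = sim_graph / sim_graph.sum(axis=1, keepdims=True)
    return -(target * np.log(p + 1e-12)).sum(axis=1).mean()

rng = np.random.default_rng(0)
z = rng.normal(size=(4, 8))
# Binary graph: each sample has exactly one positive (standard contrastive).
binary = np.array([[0, 1, 0, 0], [1, 0, 0, 0], [0, 0, 0, 1], [0, 0, 1, 0]], float)
# Soft graph: graded similarity mass on the remaining cross-sample pairs.
soft = binary + 0.2 * (1 - np.eye(4) - binary)
print(graph_contrastive_loss(z, binary), graph_contrastive_loss(z, soft))
```

Swapping `binary` for `soft` is the whole change: the optimization target now asks related but non-positive pairs to sit closer in embedding space instead of being pushed away uniformly.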
published: 16 Sep 2024
58:40
Graphs in Criminal Intelligence - Webinar | GraphAware Hume
The webinar Graphs in Criminal Intelligence was run by Dan Newland, GraphAwares General Manager ANZ, who is highly experienced with the implementation of graph ...
The webinar Graphs in Criminal Intelligence was run by Dan Newland, GraphAwares General Manager ANZ, who is highly experienced with the implementation of graph solutions in several government organizations in Australia. Graphs can be truly transformational for law enforcement agencies. Learn how a cutting-edge graph solution removes obstacles from the criminal intelligence process and increases its efficiency.
---------
0:00 Introduction webinar
1:25 Introduction Dan Newland
2:49 What is Neo4j and what is Hume?
4:50 Graph analytics introduction
9:30 The challenges intelligence teams face
22:00 Practical implementation of intelligence graphs
37:52 Demonstration of Hume Intelligence Analysis
50:55 Q&A
---------
Hume is an enterprise-level graph analytics solution that allows you to easily convert multiple distributed data sources into a single, connected source of truth: the knowledge graph.
Interested in seeing what Hume could mean for your project or company? Request a demo https://graphaware.com/contact/humedemo/
Interesting articles:
- Knowledge Graphs with Entity Relations: Is Jane Austen employed by Google? - https://graphaware.com/nlp/2020/10/20/ere-jane-austen.html
- Unstructured Data Processing in Hume - https://graphaware.com/features/unstructured-data-processing.html
- A Knowledge Graph-based Perspective on Named Entity Disambiguation in the Healthcare Domain - https://graphaware.com/blog/hume/KG-based-Perspective-on-NED.html
#lawenforcement #graphanalytics #intelligenceanalysis
- published: 26 Oct 2022
- views: 1811
29:23
$\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
Original paper: https://arxiv.org/abs/2407.18134
Title: $\mathbb{X}$-Sample Contrastive Loss: Improving Contrastive Learning with Sample Similarity Graphs
Authors: Vlad Sobal, Mark Ibrahim, Randall Balestriero, Vivien Cabannes, Diane Bouchacourt, Pietro Astolfi, Kyunghyun Cho, Yann LeCun
Abstract:
Learning good representations involves capturing the diverse ways in which data samples relate. Contrastive loss, an objective matching related samples, underlies methods from self-supervised to multimodal learning. Contrastive losses, however, can be viewed more broadly as modifying a similarity graph to indicate how samples should relate in the embedding space. This view reveals a shortcoming in contrastive learning: the similarity graph is binary, as only one sample is the related positive sample. Crucially, similarities across samples are ignored. Based on this observation, we revise the standard contrastive loss to explicitly encode how a sample relates to others. We experiment with this new objective, called X-Sample Contrastive, to train vision models based on similarities in class or text caption descriptions. Our study spans three scales: ImageNet-1k with 1 million, CC3M with 3 million, and CC12M with 12 million samples. The representations learned via our objective outperform both contrastive self-supervised and vision-language models trained on the same data across a range of tasks. When training on CC12M, we outperform CLIP by 0.6% on both ImageNet and ImageNet Real. Our objective appears to work particularly well in lower-data regimes, with gains over CLIP of 16.8% on ImageNet and 18.1% on ImageNet Real when training with CC3M. Finally, our objective seems to encourage the model to learn representations that separate objects from their attributes and backgrounds, with gains of 3.3-5.6% over CLIP on ImageNet9. We hope the proposed solution takes a small step towards developing richer learning objectives for understanding sample relations in foundation models.
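The change the abstract describes, from a binary positive/negative graph to soft cross-sample targets, amounts to replacing the one-hot target of the standard contrastive (InfoNCE) loss with a row-normalised similarity graph inside a cross-entropy. A minimal NumPy sketch of that idea (my own paraphrase, not the authors' code; the temperature value and graph construction are assumptions):

```python
import numpy as np

def xsample_contrastive_loss(z, graph, temperature=0.1):
    """Cross-entropy between a soft similarity graph and the softmax of
    pairwise embedding similarities. With a one-hot graph (exactly one
    positive per row) this reduces to the standard contrastive loss."""
    z = z / np.linalg.norm(z, axis=1, keepdims=True)   # cosine similarities
    logits = z @ z.T / temperature
    np.fill_diagonal(logits, -np.inf)                  # exclude self-pairs
    logits -= logits.max(axis=1, keepdims=True)        # numerically stable softmax
    log_p = logits - np.log(np.exp(logits).sum(axis=1, keepdims=True))
    t = graph.astype(float).copy()
    np.fill_diagonal(t, 0.0)
    t /= t.sum(axis=1, keepdims=True)                  # row-stochastic targets
    # zero-target entries contribute nothing (avoids 0 * -inf)
    return float(-(t * np.where(t > 0.0, log_p, 0.0)).sum(axis=1).mean())
```

Passing a one-hot graph recovers ordinary contrastive training; the paper's contribution is building `graph` from class or caption similarity so that related, non-identical samples also attract.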
- published: 16 Sep 2024
- views: 3