Popular entries for "recognition" (51) - Hatena Bookmark

1 - 40 of 51 results

Because the tag search returned few matches, title search results are shown instead.

There are 51 entries related to "recognition". Related tags include machine learning, AI, and ocr. Popular entries include "Practical Use of Named Entity Recognition".

Practical Use of Named Entity Recognition
- 56 users
- speakerdeck.com/sansandsoc
- Technology
- 2020/10/12
Event: Natural Language Processing Study Group https://sansan.connpass.com/event/190157/ Talk title: Practical Use of Named Entity Recognition. Speaker: 高橋寛治, researcher at DSOC R&D. Twitter: https://twitter.com/SansanRandD
GitHub - openai/whisper: Robust Speech Recognition via Large-Scale Weak Supervision
- 56 users
- github.com/openai
- Learning
- 2022/09/17
- OpenAI
- Whisper
- translate
- voice
- audio
- machine learning
- AI
- Tech
Goal design and recognition activities to promote product development contributions
- 50 users
- speakerdeck.com/oomatomo
- Technology
- 2024/10/09
Want to communicate your engineering organization's achievements? How do you handle conversations with executives and non-engineering teams? https://d-plus.connpass.com/event/331345/
GitHub - kha-white/manga-ocr: Optical character recognition for Japanese text, with the main focus being Japanese manga
- 46 users
- github.com/kha-white
- Technology
- 2022/09/27
Optical character recognition for Japanese text, with the main focus being Japanese manga. It uses a custom end-to-end model built with Transformers' Vision Encoder Decoder framework. Manga OCR can be used as a general purpose printed Japanese OCR, but its main goal was to provide a high quality text recognition, robust against various scenarios specific to manga: both vertical and horizontal text
- ocr
- github
- image processing
- comic
- manga
- oss
- watch later
- Python

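Text Recognition on iOS
- 22 users
- zenn.dev/shu223
- Technology
- 2023/11/12
The long-awaited text recognition feature became available in iOS 13 and later. It can read text in images, such as photos taken with the camera. Difference from "text detection": text recognition was added as a feature of the Vision framework. Core Image's CIDetector class, on the other hand, accepts the CIDetectorTypeText type and can detect text. CIDetectorTypeText and CIFeatureTypeText have existed since iOS 9, but they only detect the "regions" of text and could not recognize what is actually written. The Vision framework, introduced in iOS 11, also offered VNDetectTextRectanglesRequest, a class for detecting text regions, from the start, but this too concerns the "regions" of text…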
GitHub - alphacep/vosk-api: Offline speech recognition API for Android, iOS, Raspberry Pi and servers with Python, Java, C# and Node
- 18 users
- github.com/alphacep
- Technology
- 2022/05/16
Vosk is an offline open source speech recognition toolkit. It enables speech recognition for 20+ languages and dialects - English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish. More to come. Vosk models are small (50 Mb) but p
VOSK Offline Speech Recognition API
- 16 users
- alphacephei.com
- Technology
- 2021/07/31
Vosk is a speech recognition toolkit. The best things in Vosk are: Supports 20+ languages and dialects - English, Indian English, German, French, Spanish, Portuguese, Chinese, Russian, Turkish, Vietnamese, Italian, Dutch, Catalan, Arabic, Greek, Farsi, Filipino, Ukrainian, Kazakh, Swedish, Japanese, Esperanto, Hindi, Czech, Polish, Uzbek, Korean, Breton, Gujarati, Tajik, Telugu. More to com
- read later
PimEyes: Face Recognition Search Engine and Reverse Image Search
- 15 users
- pimeyes.com
- Society
- 2020/10/18
Face Search Engine. Reverse Image Search. Upload photo and find out where images are published.
- search (検索)
- AI
- search
- tool
- this is amazing
- web service
GitHub - xuebinqin/U-2-Net: The code for our newly accepted paper in Pattern Recognition 2020: "U^2-Net: Going Deeper with Nested U-Structure for Salient Object Detection."
- 15 users
- github.com/xuebinqin
- Technology
- 2020/05/27
(2022-Aug.-24) We are glad to announce that our U2-Net published in Pattern Recognition has been awarded the 2020 Pattern Recognition BEST PAPER AWARD! (2022-Aug.-17) Our U2-Net models are now available on PlayTorch, where you can build your own demo and run it on your Android/iOS phone. Try out this demo and bring your ideas about U2-Net to truth in minutes! (2022-Jul.-5) O
GitHub - PaddlePaddle/PaddleOCR: Awesome multilingual OCR toolkits based on PaddlePaddle (practical ultra lightweight OCR system, support 80+ languages recognition, provide data annotation and synthesis tools, support training and deployment among server,
- 14 users
- github.com/PaddlePaddle
- Technology
- 2020/10/07
- ocr
- machinelearning
- machine learning
- tools
- mobile
How Disney uses PyTorch for animated character recognition
- 11 users
- medium.com
- Technology
- 2020/07/17
Authors: Miquel Àngel Farré, Anthony Accardo, Marc Junyent, Monica Alfaro, Cesc Guitart at Disney. Disney’s Content Genome: The long and incremental evolution of the media industry, from a traditional broadcast and home video model, to a more mixed model with increasingly digitally-accessible content, has accelerated the use of machine learning and artificial intelligence (AI). Advancing the implemen
- PyTorch
- artificial intelligence
DeNA, MoT AI study group slides: "Face Recognition and Recent ArcFace-Related Work" / Face Recognition & ArcFace papers
- 11 users
- speakerdeck.com/takarasawa_
- Technology
- 2022/05/14
Slides presented at the AI study group of DeNA and Mobility Technologies. Topics: what the face recognition field currently looks like and, in particular, how recent ArcFace-related methods have been evolving. Papers covered: AdaptiveFace (CVPR’19), AdaCos (CVPR’19), (MV-ArcFace (AAAI’20)), CurricularFace (CVPR’20), GroupFace (CVPR’20), Sub-center ArcFace (ECCV’20), MagFace (CVPR’21), ElasticFace (CVPRW’22), AdaFace (CVPR’22)
- read later
GitHub - serengil/deepface: A Lightweight Face Recognition and Facial Attribute Analysis (Age, Gender, Emotion and Race) Library for Python
- 8 users
- github.com/serengil
- Technology
- 2021/06/25
Face Recognition @ ECCV2022
- 7 users
- speakerdeck.com/takarasawa_
- Technology
- 2023/02/13
Slides presented at the AI study group of DeNA and Mobility Technologies. A catch-up on the latest papers in the face recognition field from ECCV 2022. Papers covered: Teaching Where to Look: Attention Similarity Knowledge…
- read later
GitHub - VikParuchuri/surya: OCR, layout analysis, reading order, table recognition in 90+ languages
- 7 users
- github.com/VikParuchuri
- Technology
- 2024/01/15
Suppression of RNA recognition by Toll-like receptors: the impact of nucleoside modification and the evolutionary origin of RNA - PubMed
- 6 users
- pubmed.ncbi.nlm.nih.gov
- Society
- 2020/12/14
GitHub - m-bain/whisperX: WhisperX: Automatic Speech Recognition with Word-level Timestamps (& Diarization)
- 6 users
- github.com/m-bain
- Technology
- 2023/03/16
This repository provides fast automatic speech recognition (70x realtime with large-v2) with word-level timestamps and speaker diarization. ⚡️ Batched inference for 70x realtime transcription using whisper large-v2 🪶 faster-whisper backend, requires <8GB gpu memory for large-v2 with beam_size=5 🎯 Accurate word-level timestamps using wav2vec2 alignment 👯‍♂️ Multispeaker ASR using speaker diariza
A Visual History of Interpretation for Image Recognition
- 6 users
- thegradient.pub
- Technology
- 2021/01/21
Image recognition (i.e. classifying what object is shown in an image) is a core task in computer vision, as it enables various downstream applications (automatically tagging photos, assisting visually impaired people, etc.), and has become a standard task on which to benchmark machine learning (ML) algorithms. Deep learning (DL) algorithms have, over the past decade, emerged as the most competitiv
Pokemon recognition from screenshots in browser using web worker
- 5 users
- speakerdeck.com/potato4d
- Anime and Games
- 2020/05/11
I started out preparing a universal Worker, but the punchline is that, one way or another, worker_threads ended up being unnecessary and I moved entirely to Web Worker only. The title of the internal presentation was "I want to analyze Pokemon images in the browser!" On 2020/05/11…
- read later
GitHub - sdkcarlos/artyom.js: A voice control - voice commands - speech recognition and speech synthesis javascript library. Create your own siri,google now or cortana with Google Chrome within your website.
- 5 users
- github.com/sdkcarlos
- Technology
- 2020/04/11
Due to abuse of users with the Speech Synthesis API (ads, fake system warnings), Google decided to remove the usage of the API in the browser when it's not triggered by a user gesture (click, touch etc.). This means that calling for example artyom.say("Hello") if it's not wrapped inside a user event won't work. So on every page load, the user will need to click at least once per page to all
High-Performance Large-Scale Image Recognition Without Normalization
- 5 users
- arxiv.org
- Technology
- 2021/02/17
Batch normalization is a key component of most image classification models, but it has many undesirable properties stemming from its dependence on the batch size and interactions between examples. Although recent work has succeeded in training deep ResNets without normalization layers, these models do not match the test accuracies of the best batch-normalized networks, and are often unstable for l
- image
GitHub - NVIDIA/NeMo: A scalable generative AI framework built for researchers and developers working on Large Language Models, Multimodal, and Speech AI (Automatic Speech Recognition and Text-to-Speech)
- 5 users
- github.com/NVIDIA
- Technology
- 2020/11/30
Large Language Models and Multimodal Models New Llama 3.1 Support (2024-07-23) The NeMo Framework now supports training and customizing the Llama 3.1 collection of LLMs from Meta. Accelerate your Generative AI Distributed Training Workloads with the NVIDIA NeMo Framework on Amazon EKS (2024-07-16) NVIDIA NeMo Framework now runs distributed training workloads on an Amazon Elastic Kubernetes Service
Clearview AI | Facial Recognition
- 5 users
- www.clearview.ai
- Technology
- 2020/03/26
Clearview AI’s investigative platform allows law enforcement to rapidly generate leads to help identify suspects, witnesses and victims to close cases faster and keep communities safe.
- AI
- company
- service
Features useful for record linkage (entity recognition, deduplication) - Qiita
- 5 users
- qiita.com/daimonji-bucket
- Technology
- 2020/08/13
When performing record linkage (entity recognition, deduplication) on records or objects using supervised learning, unsupervised learning, or a search engine, you need to extract features from each field. Surprisingly few references cover this in one place, so I have listed the features most commonly used, especially for string fields. Some of them are also used for database blocking. The classification of feature types follows my own criteria. Token 固有
- read later
GitHub - open-mmlab/mmocr: OpenMMLab Text Detection, Recognition and Understanding Toolbox
- 5 users
- github.com/open-mmlab
- Technology
- 2021/04/08
- PyTorch
- OCR
- tech
An Image is Worth 16x16 Words: Transformers for Image Recognition at Scale
- 4 users
- arxiv.org
- Technology
- 2021/05/01
While the Transformer architecture has become the de-facto standard for natural language processing tasks, its applications to computer vision remain limited. In vision, attention is either applied in conjunction with convolutional networks, or used to replace certain components of convolutional networks while keeping their overall structure in place. We show that this reliance on CNNs is not nece
- google
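Trying out easy face recognition in Python (face-recognition)
- 4 users
- blog.grasys.io
- Technology
- 2023/06/22
Nice to meet you! I'm Uema, an engineer. In recent years, face recognition technology has been used for many things, such as unlocking smartphones and authenticating people entering buildings. Using face recognition requires knowledge of machine learning, image processing, mathematics, and more, so the learning cost is high, and even people who want to build an application with face recognition find it hard to get started. Good news for those people: there is a Python library called face-recognition that makes face recognition easy. As an entry point to face recognition, this article actually tries out face-recognition. What is face-recognition? It is a library that lets you easily detect and recognize faces from Python code or the command line. The face recognition model of face-recognition reportedly achieves 99% accuracy. Installation (mac): Python and homebrew…
- machine learning
- study
- AI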
The sound recognition feature (Sound Recognition) added in iOS 14 is apparently a bit scary: "I'm definitely never turning it on", "That's way too creepy" | ガジェット通信 GetNews
- 4 users
- getnews.jp
- Technology
- 2020/06/28
iOS 14 comes with support for Sound Recognition in Accessibility. Your phone can now listen for specific sounds (a baby crying, smoke alarm, water running, etc.) and notify you. Amazing feature for all kinds of users, inclusivity at its best. #WWDC2020 pic.twitter.com/3hIL8JuTyB (Federico Viticci (@viticci), June 23, 2020)
- reference
- culture
- Apple
Facial recognition identifies extremists storming the Capitol
- 4 users
- www.washingtontimes.com
- Learning
- 2021/01/07
Correction: An earlier version of this story incorrectly stated that XRVision facial recognition software identified Antifa members among rioters who stormed the Capitol Wednesday. XRVision did not identify any Antifa members. The Washington Times apologizes to XRVision for the error. Facial recognition software has identified neo-Nazis and other extremists as participants in Wednesday’s assault o
Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition
- 4 users
- speakerdeck.com/sansandsoc
- Technology
- 2021/04/18
Event: 6th All-Japan Computer Vision Study Group https://kantocv.connpass.com/event/205271/ Talk title: Read Like Humans: Autonomous, Bidirectional and Iterative Langua…
- Transformer
- machine learning
GitHub - exadel-inc/CompreFace: Leading free and open-source face recognition system
- 3 users
- github.com/exadel-inc
- Technology
- 2021/02/23
GitHub - DigitalNatureGroup/Remote_Voice_Recognition: Use cases of speech recognition in remote meetings
- 3 users
- github.com/DigitalNatureGroup
- Technology
- 2020/05/20
GitHub - ccoreilly/vosk-browser: A speech recognition library running in the browser thanks to a WebAssembly build of Vosk
- 3 users
- github.com/ccoreilly
- Technology
- 2022/05/19
Wav2vec: Semi-supervised and Unsupervised Speech Recognition
- 3 users
- vaclavkosar.com
- Technology
- 2021/07/04
Word2vec for audio quantizes phonemes, transforms, GAN trains on text and audio from Facebook AI. Wav2vec is fascinating in that it combines several neural network architectures and methods: CNN, transformer, quantization, and GAN tra
Chrome 91: Handwriting Recognition, WebXR Plane Detection and More
- 3 users
- blog.chromium.org
- Technology
- 2021/04/23
- Chrome
Detexify LaTeX handwritten symbol recognition
- 3 users
- detexify.kirelabs.org
- Technology
- 2021/03/15
Did this help? Hosting Detexify costs money and if it helps you may consider helping to pay the hosting bill. Want a Mac app? Lucky you. The Mac app is finally stable enough. See how it works on Vimeo. Download the latest version here. Restriction: In addition to the LaTeX command the unlicensed version will copy a reminder to purchase a license to the clipboard when you select a symbol. You can p
Facial recognition in school renders Sweden’s first GDPR fine | European Data Protection Board
- 3 users
- www.edpb.europa.eu
- Learning
- 2021/09/24
Handwriting Recognition with ML (An In-Depth Guide)
- 3 users
- nanonets.com
- Life
- 2020/08/28
handwriting-recognition/explainer.md at main · WICG/handwriting-recognition
- 3 users
- github.com/WICG
- Technology
- 2020/11/20
Handwriting is a widely used input method, one key usage is to recognize the texts when users are drawing. This feature already exists on many operating systems (e.g. handwriting input methods). However, the web platform as of today doesn't have this capability, the developers need to integrate with third-party libraries (or cloud services), or to develop native apps. We want to add handwriting re
How Disney Improved Activity Recognition Through Multimodal Approaches with PyTorch
- 3 users
- pytorch.org
- Technology
- 2022/06/18
by Monica Alfaro, Albert Aparicio, Francesc Guitart, Marc Junyent, Pablo Pernias, Marcel Porta, and Miquel Àngel Farré (former Senior Technology Manager) Introduction Among the many things Disney Media & Entertainment Distribution (DMED) is responsible for, is the management and distribution of a huge array of media assets including news, sports, entertainment and features, episodic programs, mark
