Search results for Codec: 1 - 15 of 15

Because the tag search returned few matches, title search results are shown instead.

There are 15 entries about Codec. Related tags include google, machine learning, and technology. Popular entries include "Building a Large-Scale Live-Stream Transcription System with a Neural Audio Codec - Mirrativ Tech Blog".
  • Building a Large-Scale Live-Stream Transcription System with a Neural Audio Codec - Mirrativ Tech Blog

    Hello, this is Hata. I'd like to introduce the stream transcription system we recently built on Mirrativ. Speech-to-text is available as an API from various SaaS vendors, but the system introduced here is self-hosted (it runs on our own GPU machines). We went through a lot of trial and error while building it, and I hope to share that here. Contents: What we built; Prerequisite: streaming infrastructure; Prerequisite: Unix Domain Socket; Live Recorder; Archiver; DS Filter; VAD Filter; NAC / Compress; Transcriber; NAC / Decompress; Speech To Text; Container image; Summary; We are hiring! What we built: what we built this time transcribes all audio streamed on Mirrativ, a sys…
  • Lyra: A New Very Low-Bitrate Codec for Speech Compression
  • SoundStream: An End-to-End Neural Audio Codec
  • Lyra V2 - a better, faster, and more versatile speech codec

    Since we open sourced the first version of Lyra on GitHub last year, we are delighted to see a vibrant community growing around it, with thousands of stars, hundreds of forks, and many comments and pull requests. There are people who fixed and formatted our code, built continuous integration
  • GitHub - google/lyra: A Very Low-Bitrate Codec for Speech Compression

    The basic architecture of the Lyra codec is quite simple. Features are extracted from speech every 20ms and are then compressed for transmission at a desired bitrate between 3.2kbps and 9.2kbps. On the other end, a generative model uses those features to recreate the speech signal. Lyra harnesses the power of new natural-sounding generative models to maintain the low bitrate of parametric codecs w
  • It’s High Time to Replace JPEG With a Next-Generation Image Codec

    I can be quite passionate about image codecs. A “codec battle” is brewing, and I’m not the only one to have opinions about that. Obviously, as the chair of the JPEG XL ad hoc group in the JPEG Committee, I’m firmly in the camp of the codec I’ve been working on for years. Here in this post, however, I’ll strive to be fair and neutral. The objective is clear: dethroning JPEG, the wise old Grandmaste
  • Hello, Video Codec!

    It can't be overstated how crucial video codecs are to the products we use every day. Without them, we wouldn't be able to watch videos on YouTube or meet remotely via Zoom. But how do they work? In this post, we'll explore at a high level the key concepts and defining characteristics of video codecs. Then, to further demystify them, we'll even implement one from scratch in about a hundred lines o
  • Lyra: A New Very Low-Bitrate Codec for Speech Compression
  • GitHub - facebookresearch/encodec: State-of-the-art deep learning based audio codec supporting both mono 24 kHz audio and stereo 48 kHz audio.
  • What to do when you get "UnicodeDecodeError: 'shift_jis' codec can't decode byte" - Qiita

    Environment: Windows 10 Pro version 1909, Python 3.8.5, Pandas 1.0.5. Symptom: reading a CSV file with Pandas raised an error. Traceback (most recent call last): File "C:/path/to/my_code.py", line 258, in <module> csv = read_files(target_dir) File "C:/path/to/my_code.py", line 74, in read_files data = pd.read_csv(file, encoding="shift_jis") File "C:\path\to\venv\lib\site-packages\pandas\io\parsers.py", line 676, in parser_f return _read(fi
  • How to Check Which Bluetooth A2DP Audio Codec Is Used on Windows • Helge Klein

    by: Helge, published: Sep 9, 2020, updated: Sep 16, 2020, in Windows General. This is a follow-up to my earlier article Bluetooth Audio Quality & aptX on Windows 10, based on a comment by reader eluxe. Windows makes it unnecessarily hard to identify the audio codec used by the Bluetooth A2DP profile, but there is a way. This post shows how to check if your connection makes use of aptX, LDAC, or som
  • Why Rust is not a mature programming language « Kostya's Boring Codec World

    While I have nothing against Rust as such and keep writing my pet project in Rust, there are still some deficiencies I find preventing Rust from being a proper programming language. Here I’d like to present them and explain why I deem them as such even if not all of them have any impact on me. Rust language problems First and foremost, Rust does not have a formal language specification and by that
  • MLow: Meta’s low bitrate audio codec

    At Meta, we support real-time communication (RTC) for billions of people through our apps, including WhatsApp, Instagram, and Messenger. We are working to make RTC accessible by providing a high-quality experience for everyone – even those who might not have the fastest connections or the latest phones. As more and more people have relied on our products to make calls over the years, we’ve been wo
  • Web video codec guide - Web media technologies | MDN

    This guide introduces the video codecs you're most likely to encounter or consider using on the web, summaries of their capabilities and any compatibility and utility concerns, and advice to help you choose the right codec for your project's video. Due to the sheer size of uncompressed video data, it's necessary to compress it significantly in order to store it, let alone transmit it over a networ
  • GitHub - facebookresearch/audio2photoreal: Code and dataset for photorealistic Codec Avatars driven from audio