Daily Briefing

May 4, 2026
2026-05-03
36 articles

Porsche built one of the best electric SUVs ever made, and does not expect the world to buy enough of them

Porsche unveiled the high-performance Cayenne Coupe Electric SUV at Auto China but voiced skepticism about demand, pointing to the company's financial difficulties and its retreat from earlier electric vehicle sales targets.

  • The Porsche Cayenne Coupe Electric boasts outstanding performance, including 1,139 horsepower, 0-60 mph in 2.4 seconds, 669 km driving range based on WLTP, and 16-minute fast charging.
  • The vehicle starts at $113,800 and was launched amidst the worst financial situation in Porsche's history (operating profit down 93%).
  • Porsche announced that it is dropping its goal of making 80% of its sales electric by 2030 and will continue to sell internal combustion engine and PHEV models.
  • This reflects Porsche's judgment that the premium electric vehicle market is smaller than expected.
Notable Quotes & Details
  • 1,139 hp
  • 0-60 in 2.4 seconds
  • 669 km WLTP range
  • 16-minute fast charging
  • $113,800
  • 93% operating profit decline
  • 80% EV-by-2030 target
  • 113-kilowatt-hour battery

Car enthusiast, electric vehicle market analyst, general reader

The Stanford professor behind an FDA-cleared cardiac AI wants $1 billion for his next company

Stanford professor James Cho is raising $100 million at a $1 billion valuation for Human Intelligence, a startup that applies AI to human research. His AI-biology research has a strong track record, including an FDA-cleared cardiac AI.

  • Professor James Cho founded Human Intelligence, a startup that uses AI for human research, and is raising approximately $100 million at a $1 billion valuation.
  • His research includes an FDA-cleared cardiac AI (EchoNet); Virtual Lab, which designed a novel nanobody; and Virtual Biotech, which analyzed 56,000 clinical trials.
  • The funding environment for AI-biology is very favorable, with $11 billion invested in AI drug discovery in the first quarter of 2026 alone.
  • Professor Cho is an associate professor of biomedical data science at Stanford with a strong research record in both AI and biology.
Notable Quotes & Details
  • $1 billion
  • $100 million
  • FDA-cleared cardiac AI
  • Nature-published Virtual Lab
  • 56,000 clinical trials
  • $11 billion into AI drug discovery in Q1 2026

AI researcher, biotech investor, medical technology developer

Meta signs multibillion-dollar deal for Amazon Graviton5 chips as AI compute demand outstrips $135B capex budget

Meta has signed a multibillion-dollar, multi-year contract to deploy Amazon's Graviton5 ARM CPU cores in AWS data centers, reflecting Meta's view that AI computing demand exceeds the capacity of any single supply chain.

  • Meta has signed a multi-billion dollar, multi-year contract with Amazon to deploy tens of millions of Graviton5 ARM CPU cores into AWS data centers.
  • The Graviton5 chip is a general-purpose CPU, not an AI accelerator, and handles CPU-intensive inference and orchestration tasks for agent-type AI workloads.
  • This deal is part of Meta's more than $200 billion procurement campaign across multiple vendors, including Nvidia and AMD.
  • Meta believes that the demand for AI computing has exceeded what any single supply chain can meet.
  • The fact that Amazon has signed this deal despite being a direct competitor to Meta shows the massive computing needs to run AI agents.
Notable Quotes & Details
  • multibillion-dollar
  • tens of millions of Graviton5 ARM CPU cores
  • $200 billion
  • $50B (Nvidia)
  • $60B (AMD)
  • $35B (CoreWeave)
  • $27B (Nebius)
  • $135B capex budget

AI industry insider, technology investor, cloud computing expert

Meta is firing 8,000 people. Microsoft is paying 8,750 to leave. Both are spending the savings on AI.

Meta and Microsoft have announced large-scale workforce reductions or voluntary retirement programs, suggesting both companies are pushing to cut labor costs in order to invest heavily in AI infrastructure.

  • Meta laid off 8,000 people and canceled 6,000 vacant positions, and Microsoft announced a voluntary retirement program for 8,750 U.S. employees.
  • Across the two companies, up to roughly 23,000 jobs will be lost or left unfilled (8,000 layoffs + 6,000 canceled openings + 8,750 voluntary departures = 22,750).
  • Both companies posted record sales, but made workforce cuts to invest heavily in AI infrastructure.
  • This is interpreted as a strategic move to replace labor costs with AI capital expenditures rather than financial difficulties.
  • Meta projects $115 billion to $135 billion in capital spending in 2026, with most of it being invested in data centers, Nvidia GPUs, and custom silicon.
Notable Quotes & Details
  • 8,000 people
  • 8,750 to leave
  • 23,000 positions
  • $115 billion to $135 billion for 2026
  • $72 billion (2025)

Tech industry workers, economic analysts, general readers

From web to Artificial Intelligence: Building the missing links

The article covers the challenges the web intelligence industry faces in supporting a multimodal AI data infrastructure, and the solutions it is developing, with particular emphasis on video data processing.

  • Advances in AI, especially the emergence of multimodal AI, are placing enormous pressure on data infrastructure.
  • Video datasets are more difficult to process than text and require more resources to collect at the scale needed to train advanced models.
  • TNW has developed the Video Data API to automate searching for relevant videos and extracting public data and metadata, easing the flow of data to AI labs.
  • Creator consent and ethical dataset construction emerge as important issues in multimodal AI training.
Notable Quotes & Details

AI company insider, data engineer, technology industry analyst

Apple under Ternus: what comes next for the tech giant’s hardware strategy

The article covers the outlook for Apple's AI and hardware strategy following the appointment of John Ternus as Tim Cook's successor as CEO.

  • John Ternus will succeed Tim Cook as Apple's CEO.
  • Ternus's appointment signals that Apple will refocus on developing hardware products for the AI era.
  • Apple is expected to focus on AI-based devices (smart glasses, AI-capable AirPods, etc.) rather than competing with large AI models.
  • The development of foldable iPhones and home robot products, which have long been rumored, may also be accelerated.
Notable Quotes & Details
  • John Ternus will take over as CEO later this year, succeeding Tim Cook
  • Apple into a $4 trillion global powerhouse

IT industry analyst, Apple investor, consumer technology enthusiast

Google DeepMind Introduces Vision Banana: An Instruction-Tuned Image Generator That Beats SAM 3 on Segmentation and Depth Anything V3 on Metric Depth Estimation

Google DeepMind announces Vision Banana, an image generation model that outperforms existing professional models by performing a variety of visual understanding tasks simultaneously with image generation.

  • Vision Banana is an integrated model that performs image generation and visual understanding tasks simultaneously.
  • This model outperforms or matches state-of-the-art expert systems on a wide range of visual understanding tasks, including segmentation and depth estimation.
  • It is based on the insight that training an image generation model plays a role in vision similar to that of pre-training for LLMs.
  • Vision Banana was developed by applying lightweight instruction-tuning to the base model, Nano Banana Pro (NBP).
Notable Quotes & Details
  • arXiv:2604.20329
  • Published April 22, 2026
  • beats SAM 3 on Segmentation and Depth Anything V3 on Metric Depth Estimation

AI researcher, computer vision engineer, machine learning developer

Meet GitNexus: An Open-Source MCP-Native Knowledge Graph Engine That Gives Claude Code and Cursor Full Codebase Structural Awareness

The article introduces GitNexus, an open source knowledge graph engine that gives AI coding agents a structural understanding of the codebase so that their code changes break fewer dependencies.

  • The goal is to resolve errors that occur when AI coding agents do not recognize dependencies when changing code.
  • GitNexus indexes entire code repositories into a structured knowledge graph, mapping function calls, inheritance, execution flow, and more.
  • It provides AI agents with a structural map of the code base through the Model Context Protocol (MCP) server.
  • It goes beyond the limitations of existing agents that rely on file-based context or Graph RAG.
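As a rough illustration of the idea in the bullets above (not GitNexus's actual implementation), a call graph can be built with Python's standard `ast` module and queried for every caller of a function before that function is changed. The sample source and the helper names `build_call_graph` and `callers_of` are hypothetical, and the sketch only tracks direct calls to plain function names.

```python
import ast
from collections import defaultdict

# Hypothetical three-function module used only for this illustration.
SAMPLE_SOURCE = '''
def load(path):
    return open(path).read()

def parse(path):
    return load(path).split()

def report(path):
    words = parse(path)
    return len(words)
'''


def build_call_graph(source: str) -> dict[str, set[str]]:
    """Map each top-level function to the plain names it calls directly."""
    tree = ast.parse(source)
    graph: dict[str, set[str]] = defaultdict(set)
    for node in tree.body:
        if isinstance(node, ast.FunctionDef):
            graph[node.name]  # make sure every function appears as a key
            for inner in ast.walk(node):
                # Method calls such as x.read() are skipped in this sketch.
                if isinstance(inner, ast.Call) and isinstance(inner.func, ast.Name):
                    graph[node.name].add(inner.func.id)
    return dict(graph)


def callers_of(graph: dict[str, set[str]], target: str) -> set[str]:
    """Return every function that reaches `target` directly or transitively."""
    affected: set[str] = set()
    frontier = [target]
    while frontier:
        current = frontier.pop()
        for caller, callees in graph.items():
            if current in callees and caller not in affected:
                affected.add(caller)
                frontier.append(caller)
    return affected


graph = build_call_graph(SAMPLE_SOURCE)
print(sorted(callers_of(graph, "load")))  # ['parse', 'report']
```

An agent that consults such a map before editing `load` would know that `parse` and `report` also need checking, which is the kind of dependency awareness the article describes.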
Notable Quotes & Details
  • 28,000+ stars and 3,000+ forks on GitHub with 45 contributors

AI developers, software engineers, code agent users

A Coding Implementation on Deepgram Python SDK for Transcription, Text-to-Speech, Async Audio Processing, and Text Intelligence

This is a tutorial on implementing speech-to-text, text-to-speech, asynchronous audio processing, and text intelligence using the Deepgram Python SDK.

  • Build advanced speech AI workflows using the Deepgram Python SDK
  • Audio transcription and speech generation with synchronous and asynchronous clients
  • Detailed analysis features including confidence scores, word-level timestamps, speaker separation, and AI-generated summaries
  • Advanced transcription control and text sentiment/topic/intent analysis including keyword search, substitution, and boosting
  • Provides a practical, end-to-end Deepgram voice AI workflow that can be easily applied to real-world applications
Notable Quotes & Details

Developers, AI engineers, and users interested in utilizing voice AI technology

A Coding Implementation on Microsoft’s OpenMementos with Trace Structure Analysis, Context Compression, and Fine-Tuning Data Preparation

This is a coding tutorial on reasoning-trace structure analysis, context compression, and fine-tuning data preparation using the Microsoft OpenMementos dataset.

  • Efficiently streams the OpenMementos dataset and parses its special token format
  • Examines how reasoning and summaries are constructed and measures the compressibility of memento representations across multiple domains
  • Visualizes dataset patterns, aligns the streaming format with the full subset, and simulates inference-time compression
  • Prepares data for supervised fine-tuning
  • Provides an understanding of how OpenMementos captures long-form reasoning and maintains concise summaries that support efficient training and inference
Notable Quotes & Details
  • DATASET = "microsoft/OpenMementos"

AI researcher, data scientist, LLM fine-tuning developer

Gemini Enterprise Agent Platform — Google Cloud’s next-generation AI agent integration platform

Google Cloud has officially launched the 'Gemini Enterprise Agent Platform', a next-generation integrated platform that supports the entire process of AI agent development, expansion, control, and optimization.

  • A platform that extends the existing Vertex AI, managing the entire life cycle of AI agents in a single environment
  • Provides low-code and code-oriented development environment through Agent Studio and ADK
  • Agent Runtime for sub-second cold starts and long-term workflow processing
  • Memory Bank enables personalized interaction by automatically creating/managing long-term memories in conversations
  • Strengthening inter-agent delegation and security functions (Agent Identity, Registry, Gateway, Sandbox, Anomaly Detection, Threat Detection, Security Dashboard)
  • Supports access to over 200 models including Gemini 3.1 Pro and Claude series through Model Garden
  • Presenting successful use cases from Comcast and Payhawk
Notable Quotes & Details
  • “Gemini Enterprise Agent Platform”
  • “Vertex AI”
  • "Gemini 3.1 Pro, Gemini 3.1 Flash Image, Lyria 3, Gemma 4"
  • "Anthropic's Claude series"
  • "More than 200 models"

Corporate IT Manager, AI Solutions Developer, Cloud Architect, Business Leader

ClawSweeper: AI-based open source automatic issue management bot

This item describes 'ClawSweeper', an AI-based bot that automatically triages issues and PRs in open source repositories and proposes closing items that are no longer needed.

  • AI-based issue management bot designed with the conservative principle of “if you’re not sure, don’t close it”
  • Operates with a three-stage pipeline of Plan, Review (using OpenAI Codex), and Apply
  • Proposes closing an item only when one of five specific conditions applies (already implemented, unreproducible, should move to a plug-in, unclear content, or left untouched for more than 60 days)
  • Bulk processing is performed with 40 parallel shards, and review results and decision grounds are saved as markdown files.
  • Items created by maintainers and issues/PRs of users with OWNER, MEMBER, and COLLABORATOR roles are excluded from automatic closing.
  • 8,419 issues and 5,026 PRs were reviewed over 7 days; about 33.7% of issues and 11.4% of PRs were classified as close candidates, and 3,907 items were actually cleaned up.
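The conservative rule described above ("if you're not sure, don't close it") can be sketched as a small decision function. This is an illustrative sketch only, not ClawSweeper's code: the `Item` fields, the `CloseReason` names, and the `reviewer_confident` flag are hypothetical stand-ins for whatever the review stage actually emits.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class CloseReason(Enum):
    # The five conditions listed in the summary; the enum names are hypothetical.
    ALREADY_IMPLEMENTED = "already implemented"
    UNREPRODUCIBLE = "unreproducible"
    MOVE_TO_PLUGIN = "plug-in transfer"
    UNCLEAR = "unclear content"
    STALE_60_DAYS = "left for more than 60 days"


# Author roles that are never auto-closed, per the summary.
PROTECTED_ROLES = {"OWNER", "MEMBER", "COLLABORATOR"}


@dataclass
class Item:
    number: int
    author_role: str                        # e.g. "NONE", "MEMBER"
    reviewer_reason: Optional[CloseReason]  # reason suggested by the review stage
    reviewer_confident: bool                # hypothetical confidence flag


def propose_close(item: Item) -> Optional[CloseReason]:
    """Return a close reason only when every safeguard passes, else None."""
    if item.author_role in PROTECTED_ROLES:
        return None  # maintainers' items are excluded from automatic closing
    if item.reviewer_reason is None or not item.reviewer_confident:
        return None  # when unsure, keep the item open
    return item.reviewer_reason


stale = Item(101, "NONE", CloseReason.STALE_60_DAYS, True)
unsure = Item(102, "NONE", CloseReason.UNCLEAR, False)
maintainer = Item(103, "MEMBER", CloseReason.ALREADY_IMPLEMENTED, True)

print(propose_close(stale))       # CloseReason.STALE_60_DAYS
print(propose_close(unsure))      # None
print(propose_close(maintainer))  # None
```

Defaulting to `None` on every uncertain branch means a mistake leaves an item open rather than closing it wrongly, which matches the stated design principle.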
Notable Quotes & Details
  • "openclaw/openclaw repository has over 13,000 outstanding items including open issues and PRs"
  • “OpenAI Codex(gpt-5.4)”
  • “The items reviewed within 7 days are 8,419 issues and 5,026 PRs.”
  • “Of these, approximately 33.7% of issues and 11.4% of PRs were classified as close candidates, and 3,907 were actually organized.”
  • "A single file of TypeScript approximately 2,500 lines"
  • “Go-based tsgo”
  • “Rust-based oxlint·oxfmt”

Open source project managers, developers, and users interested in AI-based automation tools.

Compound-growth startups that took the AI pill

San Francisco's AI native startups are showing rapid growth by building an operating model that is different from existing startups through the elimination of PM roles and an engineer-centered decision-making structure.

  • In AI native companies, the PM role is absorbed into engineering and design, and engineers communicate directly with customers and lead product decisions.
  • Because features can be implemented in a day, these companies impose strict constraints to manage strategic risk and resist the temptation to become a 'feature factory'.
  • They use a technology stack that includes Slack, Claude Code, GitHub, Codex, and Linear, with Slack serving as the core hub for agent orchestration.
  • AI is lowering the cost of experiments by raising the efficiency of every job function, including engineers, product managers, accounting teams, and marketers.
  • With execution costs approaching zero, ‘taste’ is becoming a key factor in competitive advantage.
Notable Quotes & Details

Startup manager, AI/IT corporate strategist, software engineer

Show GN: purplemux – Open source tmux manager for Claude Code sessions on web and mobile

purplemux is an open source tmux manager that helps you efficiently manage Claude Code sessions in web and mobile environments.

  • You can check and manage the tmux-based Claude Code session status at a glance through a web/mobile browser.
  • A multi-session dashboard is provided to manage sessions by workspace, group, and tab, making it easy to check the status.
  • This is useful for developers who run frequent AI coding sessions or are often away from their desks.
  • It can be easily installed and run using the DMG file for macOS and the npx purplemux command.
Notable Quotes & Details
  • Default port 8022
  • GitHub (WITH)
  • https://subicura.com/purplemux/ko/docs/

Software developer, AI/CLI agent user

Show GN: imgssh - Paste local clipboard image from within SSH

imgssh is a tool that allows you to easily upload local clipboard images to a remote server and enter the path within an SSH session, increasing the convenience of using images in CLI environments such as Claude Code or Codex.

  • Upload an image from your local clipboard to `/tmp` on a remote server using the Ctrl+] shortcut within an SSH session and automatically enter the corresponding file path.
  • Eliminates the hassle of image insertion when using terminal-based AI coding tools such as Claude Code or Codex.
  • It is implemented as an 'ssh wrapper' method that wraps SSH itself instead of a terminal-specific plug-in method, making it highly versatile.
  • Each imgssh process handles a separate session, allowing multiple tabs to upload images to different servers.
Notable Quotes & Details
  • GitHub: https://github.com/coderredlab/imgssh
  • Ctrl+]

Software developer, AI/CLI agent user, system administrator

How to find to 'collaborate' with Professors to get funding for my research papers? [D]

A researcher who cannot afford conference registration fees asks how to find a professor willing to join as a co-author, fund publication of the paper, and not require major changes to the research direction.

  • The author was forced to withdraw a presentation at the CVPR Archival Workshop for financial reasons.
  • The author, an orphan researcher from India, hopes to collaborate with professors at European or American universities who can provide research funding.
  • The author's conditions are to remain first/main author and to avoid major changes to the research content.
  • The author is seeking external collaboration because of trust issues with their own university.
Notable Quotes & Details

AI researchers, academic officials, professors

How would you build an automated commentary engine for daily trade attribution at scale? [R]

Questions about how to build a system that analyzes thousands of transaction data to produce market risk reports and generates automated commentary in a precise, human-readable format.

  • An automatic commentary engine is needed to create financial market risk reports.
  • Because LLMs can hallucinate arithmetic, Python/Polars is used for the quantitative analysis.
  • Exploring how to balance precise mathematical calculations with dynamic natural language generation while avoiding the rigidity of hard-coded ETL pipelines.
  • Discussion of whether to use agentic workflows (LLM dynamically executes Polars/pandas code) or pre-computed data cubes and structured prompts.
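One of the two options under discussion, pre-computed figures plus a structured prompt, can be sketched as follows. This is a minimal sketch under stated assumptions, not the poster's system: the trade records, the `desk` and `pnl` fields, and all function names are hypothetical, and plain Python stands in for Polars so the example has no dependencies. Every number is computed deterministically in code, and the text layer only formats values it is handed.

```python
from collections import defaultdict

# Hypothetical trade-level P&L records.
TRADES = [
    {"desk": "rates", "pnl": 32_000.0},
    {"desk": "rates", "pnl": 18_000.0},
    {"desk": "fx", "pnl": -7_500.0},
    {"desk": "credit", "pnl": 2_500.0},
]


def attribute_pnl(trades):
    """Aggregate P&L per desk; this is the only place arithmetic happens."""
    totals = defaultdict(float)
    for trade in trades:
        totals[trade["desk"]] += trade["pnl"]
    return dict(totals)


def build_facts(totals):
    """Turn aggregates into a structured fact sheet for the text layer."""
    ranked = sorted(totals.items(), key=lambda kv: abs(kv[1]), reverse=True)
    return {
        "total_pnl": sum(totals.values()),
        "top_driver": ranked[0][0],
        "top_driver_pnl": ranked[0][1],
        "by_desk": ranked,
    }


def format_money(value):
    """Format a dollar amount as a signed figure in thousands, e.g. +$50.0k."""
    sign = "+" if value >= 0 else "-"
    return f"{sign}${abs(value) / 1000:,.1f}k"


def render_commentary(facts):
    """Template-based commentary; an LLM could rephrase it, but never recompute."""
    return (
        f"Daily P&L was {format_money(facts['total_pnl'])}, "
        f"driven mainly by the {facts['top_driver']} desk "
        f"({format_money(facts['top_driver_pnl'])})."
    )


facts = build_facts(attribute_pnl(TRADES))
print(render_commentary(facts))
# Daily P&L was +$45.0k, driven mainly by the rates desk (+$50.0k).
```

If an LLM is added for fluency, passing it only the `facts` dictionary and checking that every figure in its output appears in that dictionary keeps the numbers auditable.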
Notable Quotes & Details
  • +$50k

AI developer, machine learning engineer, financial analyst

Open-source 9-task benchmark for coding-agent retrieval augmentation. Per-task deltas +0.010 to +0.320, all evals reproducible [P]

Announces the release of `paper-lantern-challenges`, an open source nine-task benchmark suite that measures how much retrieval augmentation improves coding agents.

  • Open source benchmark suite that measures the performance of coding agents with and without retrieval augmentation.
  • The tests used the same coding agent consisting of Claude Opus 4.6 (planner) and Gemini Flash 3 (task model).
  • The benchmark includes nine real-world engineering tasks, including test generation, text-to-SQL, PDF/contract extraction, PR review, text classification, few-shot prompt selection, LLM routing, and summary evaluation.
  • Each task has clear quantitative metrics and is reproducible in approximately 10 minutes using a Gemini API key.
  • Agents with retrieval can look up techniques, implementation steps, and failure modes from the CS literature.
Notable Quotes & Details
  • +0.010 to +0.320
  • 9 tasks
  • Claude Opus 4.6
  • Gemini Flash 3
  • 10 minutes

AI researcher, machine learning engineer, coding agent developer

Notes: Includes a self-disclosure statement stating that the author is the developer of the search system being tested (paperlantern.ai/code).

We released an open source tool that handles AI agent setup and config. 700 stars and growing. What features do you want to see?

Announces the launch of 'Caliber', an open source tool that helps make AI agent setups reproducible and manageable, and asks the community for feedback.

  • Developed and released an open source tool 'Caliber' to solve the difficulties of setting up AI agents.
  • Caliber focuses on making AI agent setups reproducible and consistent.
  • It has over 700 stars on GitHub, and is hoping to improve further features by asking for feedback from developers.
  • Questions about bridging the gap between local and production environments and methodologies for managing agent configurations across environments.
Notable Quotes & Details
  • 700 GitHub stars
  • 100 forks

AI developer, AI system administrator

Notes: A post promoting the project and requesting feedback.

Got into the Anthropic Claude Partner Network — have spots for people who want CCAF cert access

Joined the Anthropic Claude Partner Network, offering the opportunity to gain access to the CCAF certification exam by completing the CPN learning path.

  • Join the Anthropic Claude Partner Network and gain access to the CCAF certification exam.
  • Provides 4 learning paths required to obtain certification (Agent Skills, Claude API, MCP, Claude Code in Action).
  • You can participate through your company domain email alias, and the course is self-study.
  • This is a useful course for developers using Claude, and provides an opportunity for those who are interested.
Notable Quotes & Details
  • 10 people
  • 4 courses

Claude Developer, AI Business Insider

Notes: Promotional content for Anthropic Claude Partner Network and CCAF certification opportunities.

GPT-5.5: 'strongest agentic coding model ever' failing spectacularly at its own game (LiveBench)

OpenAI's GPT-5.5 scored significantly worse than its predecessor and competing models on LiveBench's agentic coding benchmark, drawing criticism that it falls short of OpenAI's marketing claims.

  • OpenAI promotes GPT-5.5 as its "strongest agentic coding model" and even created a new subscription tier for it.
  • In the independent LiveBench benchmark, GPT-5.5 scored 56.67 points, significantly lower than its predecessor, GPT-5.4 (70.00 points).
  • Other competing models, such as Gemini 3.1 Pro and Claude 4.6, also performed better in LiveBench than GPT-5.5.
  • In benchmarks not designed by OpenAI, the performance of GPT-5.5 was significantly poor, revealing a gap between publicity and actual performance.
Notable Quotes & Details
  • “GPT‑5.5 is our strongest agentic coding model to date.”
  • “The gains are especially strong in agentic coding.”
  • "GPT-5.5 xHigh Effort is 56.67"
  • "GPT-5.4 throws it at 70.00"
  • "ranks 11th"

AI developer, AI researcher, technology news reader

What AI models/companies you think is best value?

A user with experience of Perplexity PRO and Gemini is choosing their next AI subscription and asks which AI model or company offers the best value. They cite recent market changes such as the perceived decline of Anthropic's models and the high cost of OpenAI.

  • The user has subscribed to Perplexity PRO and Gemini and is looking for a new AI subscription as those plans near expiry.
  • The user is concerned that Anthropic models tend to degrade over time.
  • OpenAI's drawbacks are cited as its high price and the lack of an annual discount plan.
  • Kimi is hard to consider because of payment issues and a lack of information.
  • Given the shifting AI market, the user wants recommendations for the services that offer the best value for money.
Notable Quotes & Details

General users, AI service subscribers, technology news readers

WHY AI ALIGNMENT IS ALREADY FAILING

The April 2026 paper 'Architectures of Thought' combines three recent empirical findings (self-preservation behavior in frontier models, accurate modeling of the world, and unleashing abilities beyond control) with structural facts about coding ability. It argues that current AI safety paradigms leave these risks unaddressed and warns that AI alignment is already failing.

  • It has been argued that AI alignment and containment are not stable and are failing.
  • Citing a 2022 case in which Collaborations Pharmaceuticals' MegaSyn AI generated 40,000 novel chemical weapon candidates simply by changing the reward function, the paper emphasizes the risk of redirecting AI systems.
  • This paper addresses the risks of current systems rather than hypotheses about future superintelligence.
  • The paper points out that the implications of three key recent discoveries (self-preservation behavior, accurate modeling of the world, and unleashing capabilities beyond control) are being missed in AI safety discussions.
Notable Quotes & Details
  • “Architectures of Thought April 2026”
  • “2022”
  • "40,000 novel chemical weapons"
  • "April 2026"

AI researchers, AI ethics researchers, policy makers

Notes: Text is truncated

Qwen3.6-27B at ~80 tps with 218k context window on 1x RTX 5090 served by vllm 0.19

The Qwen3.6-27B model reached approximately 80 tps with a 218k context window on a single RTX 5090 GPU served by vLLM 0.19, demonstrating the potential for high-performance local LLM inference.

  • The Qwen3.6-27B model and NVFP4 with MTP version were released on Hugging Face.
  • Using the same setup as for the earlier Qwen3.5-27B, a throughput of ~80 tps was achieved.
  • This performance was achieved with a 218k context window on a single RTX 5090 using the latest build, vLLM 0.19.1rc1.
  • This suggests the feasibility of efficient execution of large-scale language models in local environments.
Notable Quotes & Details
  • "Qwen3.6-27B"
  • "80 tps"
  • "218k context window"
  • "1x RTX 5090"
  • "vllm 0.19"
  • "Qwen3.5-27B"
  • "vLLM 0.19.1rc1"

LLM Developer, AI Engineer, Hardware Enthusiast

I'm glad we have deepseek

Unlike other companies, DeepSeek continues to release open-weight models, shares detailed research papers, and is receiving positive reviews for serving as a key driving force in the development of AI technology.

  • Most AI companies are reducing or delaying the disclosure of their open weight models.
  • Model families that once shipped with detailed research papers, such as Gemma and Qwen, now arrive with only blog posts or model cards.
  • Kimi, GLM, Minimax, Qwen, and others have withheld base models or delayed open-weight releases.
  • DeepSeek publishes impressive research results every month, releases base models and open weights immediately, and provides detailed training and architecture descriptions.
  • DeepSeek plays a critical role in advancing technology and efficiency in the field of AI.
Notable Quotes & Details

AI researcher, open source LLM community

Decreased Intelligence Density in DeepSeek V4 Pro

Analysis has suggested that the DeepSeek V4 Pro model has reduced 'intelligence density' compared to its previous version, V3.2, and requires more tokens to achieve similar performance to Gemini 3.0 Pro or GPT-5.4/5.5.

  • In the DeepSeek V3.2 paper, it was mentioned that token efficiency is a challenge and that longer generation trajectories (more tokens) are needed for the same output quality as Gemini 3.0 Pro.
  • The situation is worse in V4 Pro, which uses significantly more tokens than V3.2 even though V4 Pro (1.6T) is about 2.4 times larger than V3.2 (0.67T).
  • This suggests that the intelligence density of the model has decreased rather than improved.
  • Compared to GPT-5.4 and GPT-5.5, the gap becomes even larger, with DeepSeek requiring approximately 10 times more tokens for similar performance.
  • Assuming the same TPS, DeepSeek V4 Pro can take approximately 10x longer to complete the same task.
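The arithmetic behind the last two bullets is simple enough to check directly. The sketch below assumes a hypothetical baseline of 2,000 output tokens and a hypothetical decode speed of 50 tokens per second; only the 10x token multiplier and the two parameter counts come from the post.

```python
def completion_time_s(tokens: int, tokens_per_second: float) -> float:
    """Wall-clock generation time if decoding speed is constant."""
    return tokens / tokens_per_second


BASELINE_TOKENS = 2_000  # hypothetical token count for the more efficient model
TOKEN_MULTIPLIER = 10    # "approximately 10 times more tokens", per the post
TPS = 50.0               # hypothetical; assumed identical for both models

baseline_s = completion_time_s(BASELINE_TOKENS, TPS)
deepseek_s = completion_time_s(BASELINE_TOKENS * TOKEN_MULTIPLIER, TPS)
slowdown = deepseek_s / baseline_s

# Parameter counts quoted in the post, in trillions.
size_ratio = 1.6 / 0.67

print(f"baseline: {baseline_s:.0f} s, V4 Pro: {deepseek_s:.0f} s, slowdown: {slowdown:.0f}x")
print(f"V4 Pro / V3.2 parameter ratio: {size_ratio:.2f}")
```

Because time scales linearly with token count at a fixed TPS, the slowdown equals the token multiplier regardless of which baseline or TPS values are plugged in.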
Notable Quotes & Details
  • V4 Pro (1.6T)
  • V3.2 (0.67T)
  • DeepSeek requires approximately 10 times more tokens than GPT-5.4/5.5

AI Researcher, LLM Performance Analyst

🛡️ Shield 82M: A PII stripping/filtering model 🛡️

A new open source Personally Identifiable Information (PII) filtering model, 'Shield 82M', has been released, which can remove PII such as names, emails, and phone numbers from text in multiple languages with approximately 96% accuracy.

  • 🛡️ Shield 82M is a fine-tuned model of distilroberta-base.
  • This model can filter out any type of PII from text in any language.
  • Recognizes and replaces various types of PII, including names, emails, phone numbers, and addresses.
  • The total accuracy is approximately 96%, showing very high performance.
  • Like all of the author's models, this one is completely open source.
Notable Quotes & Details
  • Shield 82M
  • Approximately 96% accuracy

Developers, data scientists, and those looking to build PII protection solutions

Throughput and TTFT comparisons of Qwen 3.6 27B, Qwen 3.6 35B A3B and Gemma 4 models on H100

The author benchmarked the throughput and time to first token (TTFT) of several small and medium-sized LLMs on H100 GPUs and found that Gemma 4 E2B-it outperformed the others and that FP8 quantization significantly sped up the MoE model.

  • We measured the throughput (tokens/second) and Time to First Token (TTFT) of eight models using H100 80GB GPU and vLLM 0.19.1.
  • The small Gemma model (Gemma 4 E2B-it) outperformed the other models, recording 3,180 TPS with 16 concurrent users.
  • The TTFT of Gemma 4 E2B-it is 55ms, which is very fast compared to 4.1 seconds of Gemma 4 31B dense.
  • FP8 quantization resulted in a 73% speedup on the Qwen 3.6 35B MoE model and was particularly effective in alleviating the memory movement bottleneck of the MoE model.
  • The Gemma 4 31B dense model's performance degrades rapidly under high load (more than 4 users) on a single GPU, so the MoE model is recommended for high concurrency.
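To put the reported figures in per-user terms, the short calculation below divides the quoted throughput by the number of concurrent users and compares the two Gemma models. It assumes that the TPS numbers are aggregate across all 16 users and that throughput is shared evenly, neither of which the post states explicitly.

```python
# Figures quoted in the benchmark summary.
RESULTS = {
    "Gemma 4 E2B-it": {"tps": 3180.0, "ttft_s": 0.055},
    "Gemma 4 31B dense": {"tps": 226.0, "ttft_s": 4.1},
}
CONCURRENT_USERS = 16


def per_user_tps(total_tps: float, users: int) -> float:
    """Average tokens/second each user sees, assuming an even split."""
    return total_tps / users


small = RESULTS["Gemma 4 E2B-it"]
dense = RESULTS["Gemma 4 31B dense"]

small_per_user = per_user_tps(small["tps"], CONCURRENT_USERS)
dense_per_user = per_user_tps(dense["tps"], CONCURRENT_USERS)
throughput_ratio = small["tps"] / dense["tps"]
ttft_ratio = dense["ttft_s"] / small["ttft_s"]

print(f"E2B-it per user:    {small_per_user:.1f} tok/s")
print(f"31B dense per user: {dense_per_user:.1f} tok/s")
print(f"throughput ratio:   {throughput_ratio:.1f}x")
print(f"TTFT ratio:         {ttft_ratio:.1f}x")
```

At roughly 14 tokens per second per user, the dense model's output would feel slow in an interactive session, which is consistent with the recommendation to avoid it at high concurrency on a single GPU.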
Notable Quotes & Details
  • H100 80GB
  • Gemma 4 E2B-it: 3,180 TPS (16 concurrent users)
  • Gemma 4 E2B-it: TTFT 55 ms
  • Gemma 4 31B dense: 226 TPS (16 concurrent users)
  • Gemma 4 31B dense: TTFT 4.1 seconds
  • Qwen 3.6 35B MoE FP8: 73% faster than BF16

AI engineer, LLM operator, hardware performance optimization researcher

Build yourself flowers

A machine learning systems developer describes using an LLM to streamline writing and slide creation, and reflects on the role and identity of machine learning engineering in the current AI era.

  • The author used LLM (Gemini Flash 2.5) to automate and streamline the content creation process, including writing lecture drafts, recording audio, and extracting slide images.
  • The introduction of AI technology is changing machine learning workflows, raising questions about the role of traditional machine learning.
  • The author expresses existential concerns about his role in the changing professional identities of data scientists, machine learning engineers, and AI engineers.
  • This raises the question of whether machine learning engineering is still a valuable field in the era of LLM-centered generative AI.
Notable Quotes & Details

Machine learning engineers, AI developers, content creators interested in utilizing AI technology

I drove a bulldozer over this SSD enclosure so you don't have to - here's the result

ZDNet editors conduct an extreme bulldozing experiment to test the durability of SSD enclosures and share the results.

  • ZDNet provides trusted recommendations through product testing and research.
  • It is emphasized that as a portable data storage device, SSD is much more robust and reliable than traditional HDD.
  • The editors ran unusual durability tests to see how well SSD enclosures hold up in extreme conditions.
  • The piece is presented as an independent review intended to help readers make smart purchasing decisions.
Notable Quotes & Details

General consumers, IT device review readers, prospective SSD purchasers

CISA Adds 4 Exploited Flaws to KEV, Sets May 2026 Federal Deadline

The U.S. CISA added four vulnerabilities affecting SimpleHelp, Samsung MagicINFO 9 Server, and D-Link DIR-823X series routers to its Known Exploited Vulnerabilities (KEV) catalog and set a May 2026 deadline for federal agencies to patch them.

  • CISA listed four actively exploited vulnerabilities, including privilege escalation and path traversal flaws in SimpleHelp (CVE-2024-57726, CVE-2024-57728).
  • A path traversal vulnerability in Samsung MagicINFO 9 Server (CVE-2024-7399) and a command injection vulnerability in the D-Link DIR-823X router (CVE-2025-29635) were also added to the list.
  • The SimpleHelp vulnerabilities were exploited as a precursor to ransomware attacks (involving DragonForce), the Samsung vulnerability was exploited to deploy the Mirai botnet, and the D-Link vulnerability was exploited to deploy the "tuxnokill" Mirai botnet variant.
  • Federal Civilian Executive Branch (FCEB) agencies have until May 2026 to mitigate these vulnerabilities.
Notable Quotes & Details
  • CVE-2024-57726 (CVSS score: 9.9)
  • CVE-2024-57728 (CVSS score: 7.2)
  • CVE-2024-7399 (CVSS score: 8.8)
  • CVE-2025-29635 (CVSS score: 7.5)
  • May 2026
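
For triage, the scores listed above can be mapped to the qualitative severity bands of the CVSS v3.x rating scale. The following is a minimal sketch, not part of the original report: it assumes the scores are CVSS v3.x base scores (the briefing does not state the version), and the names `kev_entries` and `severity` are hypothetical, not part of any CISA tooling.

```python
# Scores as reported in the briefing; assumed here to be CVSS v3.x base scores.
kev_entries = {
    "CVE-2024-57726": 9.9,
    "CVE-2024-57728": 7.2,
    "CVE-2024-7399": 8.8,
    "CVE-2025-29635": 7.5,
}


def severity(score: float) -> str:
    """Map a CVSS v3.x base score to its qualitative severity rating."""
    if not 0.0 <= score <= 10.0:
        raise ValueError(f"CVSS score out of range: {score}")
    if score == 0.0:
        return "None"
    if score < 4.0:
        return "Low"
    if score < 7.0:
        return "Medium"
    if score < 9.0:
        return "High"
    return "Critical"


# Order entries from highest to lowest score as one possible patching order.
ranked = sorted(kev_entries.items(), key=lambda item: item[1], reverse=True)
for cve, score in ranked:
    print(f"{cve}: {score} ({severity(score)})")
```

The score is only one input to prioritization: every entry in the KEV catalog is there because it is known to be exploited, whatever its severity band.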

Information security experts, system administrators, network administrators, IT policy managers

To what extent will conversations with AI be accepted as ‘evidence’ in court?

The article reviews Korean and international court precedents and legal issues around whether conversation records with AI, such as ChatGPT, can be admitted as evidence in court. It analyzes the admissibility of AI conversation records, whether confidentiality protections apply, and whether the records can be subject to search and seizure.

  • In a Korean motel murder case, ChatGPT conversation records drew attention as a clue to proving criminal intent, and a U.S. court has also accepted AI conversation records as evidence.
  • AI conversation records are digital data stored in information storage media and can be treated as evidence in the same way as email/KakaoTalk.
  • In principle, AI conversation records are classified as circumstantial evidence (indirect evidence) rather than direct evidence, and must be combined with other evidence to be effective. However, in special cases where the conversation act itself is a crime, it can be recognized as direct evidence.
  • AI service providers store user conversation records for a long period of time, and even temporary chats without login are recorded, making it virtually impossible to destroy evidence.
  • Because AI conversations are transmitted to third-party servers (OpenAI, Google, etc.), precedents conflict: one ruling held that attorney-client privilege is difficult to apply, while another held that there is no obligation to produce AI conversations because they are viewed as "work product."
  • Attorney Lim Chang-guk recommends refraining from raising sensitive matters with AI and seeking legal advice from a lawyer instead.
Notable Quotes & Details
  • Criminal Procedure Act, Article 106, Paragraph 3
  • Article 37, Paragraph 2 of the Constitution
  • 30 days
  • 6 months

Legal professionals, corporate legal teams, AI service users, general readers

Canada's Cohere acquires German AI startup Aleph Alpha..."Supports European Sovereign AI"

Canadian AI company Cohere has acquired Germany's Aleph Alpha to target the European market and secure AI sovereignty. The combined company will keep the Cohere name and operate from bases in Canada and Germany.

  • Cohere acquires Aleph Alpha and aims to build ‘Sovereign AI’ in Europe.
  • Schwarz Group, a German retail conglomerate, invested $600 million (approximately KRW 880 billion) in Cohere's new funding round and will provide AI infrastructure.
  • The deal is part of a move to reduce Europe's dependence on U.S. AI companies and secure control over its data.
  • Cohere and Aleph Alpha plan to provide on-premise AI solutions to regulated industries such as energy, defense, and finance.
Notable Quotes & Details
  • 2026-04-24 (local time)
  • $20 billion (approximately 30 trillion won)
  • $600 million (about 880 billion won)

AI industry insiders, investors, and policy makers

OpenAI officially apologizes for the Canadian shooting incident... "Insufficient sharing of suspect information"

OpenAI CEO Sam Altman officially apologized for failing to share suspect information with police sooner in connection with the Tumbler Ridge shooting in Canada and promised stronger measures to prevent a recurrence.

  • OpenAI CEO Sam Altman apologized for the company's inadequate response to the Canadian shooting incident.
  • Although the suspect used ChatGPT to create a violent scenario, investigators were notified late.
  • OpenAI acknowledged system flaws to Canadian authorities and promised to strengthen its safety and reporting system.
  • In the U.S. state of Florida, an investigation is underway into suspicions that ChatGPT was used in a similar incident.
  • The Canadian government is taking this incident as an opportunity to begin discussions on AI regulation, and there have also been calls for restrictions on the use of AI chatbots by youth.
Notable Quotes & Details
  • 2026-02
  • 2025-06
  • 2026-02

AI developers, AI service users, policy makers, general readers

DeepSeek-V4 ranks only 10th in the world... "The United States is relieved by the below-expected performance"

The United States was relieved when DeepSeek's flagship model 'V4' ranked only 10th in the world, with performance below expectations. V4 has strengths in efficiency and price competitiveness, and its delayed release is attributed to optimization for Chinese chips.

  • DeepSeek-V4 ranked 10th in the world in the Artificial Analysis (AA) model rankings.
  • It was 8 points behind GPT-5.5, and ranked only 4th among Chinese models.
  • Experts judged that V4 failed to narrow the AI gap with the United States.
  • V4 supports a context window of 1 million tokens, and KV cache usage is reduced by 10% compared to the previous version, making it highly cost-effective.
  • The delay in release is believed to be due to the process of optimizing the model for Chinese chips (Huawei, Cambricon).
Notable Quotes & Details
  • 15 months
  • 52 points
  • 10th place
  • 8 points
  • 54 points
  • 1 million tokens
  • 10%
  • $1.74
  • $3.48
  • $5
  • $30
  • $25
  • $0.30
  • $1.2
  • $0.6
  • $3
  • $1
  • $3
  • 2025-01

AI researcher, AI developer, AI industry analyst

[April 24] "GPT-5.5 is more honest and beat Claude 4.7?"... Differences in strategy shown by 'Vending Bench'

OpenAI CEO Sam Altman shared that GPT-5.5 beat Claude Opus 4.7 in Andon Labs' 'Vending Bench Arena'. The result comes from the multiplayer setting, however, and may not reflect the experiment as a whole; it also illustrates the difference between the two companies' AI alignment methods.

  • Sam Altman, CEO of OpenAI, shared a post saying that GPT-5.5 beat Claude Opus 4.7 in 'Vending Bench Arena'.
  • According to the analysis, GPT-5.5 earned $7,980 with an honest strategy, surpassing Opus 4.7's $5,838.
  • This illustrates the difference between two AI alignment methods: Anthropic's 'Constitutional AI' and OpenAI's reinforcement learning from human feedback (RLHF).
  • Altman emphasized the importance of ‘iterative deployment’ in AI safety.
  • The shared result is from multiplayer mode; in single-player mode, Opus 4.7 took first place by a wide margin.
Notable Quotes & Details
  • $7,980
  • $5,838
  • $2,158
  • $10,500
  • $8,017

AI researchers, AI developers, AI ethics experts, general readers

Jooojub
System S/W engineer
    © 2026. jooojub. All rights reserved.