Daily Briefing

April 20, 2026
2026-04-19
41 articles

A humanoid robot just beat the human half-marathon world record by seven minutes in Beijing

A humanoid robot called 'Lightning' completed the Beijing half marathon in 50 minutes 26 seconds, breaking the human world record by nearly 7 minutes.

  • The Lightning robot autonomously completed the 21km course using sensor fusion and real-time decision-making algorithms.
  • A remotely controlled Lightning robot achieved an even faster time of 48 minutes 19 seconds.
  • The time is approximately 7 minutes faster than the human world record of 57 minutes 20 seconds.
  • The robot stands 169cm tall, has a 95cm effective leg length, 400Nm peak torque, and a proprietary liquid cooling system.
  • Last year's event saw only 6 of 21 robots finish, but this year more than 300 robots from 26 brands participated and the event was held successfully.
Notable Quotes & Details
  • 50 minutes 26 seconds
  • 7 minutes
  • 48 minutes 19 seconds
  • 57 minutes 20 seconds
  • 2026-04-19
  • 21km

General readers, robotics researchers, AI technology enthusiasts

Trump wants to stop states from regulating AI. States and Congress keep saying no.

The Trump administration is attempting to limit state governments' authority over AI regulation, but states and Congress are pushing back and pursuing their own AI legislation.

  • The Trump administration is trying to block state-level AI regulation through a DOJ task force, Commerce Department assessments, and proposing a legislative framework to Congress.
  • In 2025, 1,208 AI-related bills were introduced in state legislatures, and 145 were enacted.
  • Congress has rejected AI regulation preemption twice, and a Utah AI transparency bill failed to pass due to White House opposition.
  • Utah Representative Doug Fiefia has been vocal in defending states' rights and critical of the White House's actions.
  • Fiefia's bill targeted 'frontier developers' and included requirements to disclose safety and child protection plans, as well as whistleblower protections.
Notable Quotes & Details
  • 1,208 AI bills
  • 145 enacted
  • 99-1 Senate vote
  • House Bill 286
  • 10^26 floating-point operations
  • $1 million penalty cap

Policymakers, legal professionals, AI industry stakeholders, general readers

Google is in talks with Marvell to build custom AI inference chips as it diversifies beyond Broadcom

Google is in discussions with Marvell Technology to develop new AI inference chips beyond its existing Broadcom partnership, as part of a strategy to diversify its custom semiconductor supply chain.

  • Google is discussing the development of memory processing units and inference-optimized TPUs with Marvell.
  • This would add a third design partner alongside Broadcom and MediaTek, with the goal of diversifying the supply chain.
  • The custom ASIC market is expected to grow 45% in 2026 and reach $118 billion by 2033.
  • Google recently announced its 7th-generation TPU 'Ironwood,' the first TPU tailored for the inference era, offering 10x peak performance over TPU v5p.
  • The chips Marvell would design are expected to complement Ironwood.
Notable Quotes & Details
  • 45%
  • 2026
  • $118 billion
  • 2033
  • 2031
  • 10x
  • 9,216
  • 10 megawatts
  • 42.5 FP8 exaflops

Semiconductor industry professionals, AI technology developers, investors, corporate strategists

Stanford's AI Index finds China has nearly closed the performance gap with the US despite spending 23 times less

According to the Stanford AI Index 2026 report, China has narrowed the AI model performance gap with the US to just 2.7 percentage points, despite spending only 1/23rd as much on private AI investment.

  • The AI model performance gap, which stood at 17.5–31.6 percentage points in May 2023, had shrunk to just 2.7 percentage points as of March 2026.
  • The US invested $285.9 billion privately in AI in 2025, compared to China's $12.4 billion — a 23x difference.
  • China leads the US in AI patents (69.7% global share), publications (23.2%), industrial robot installations (9x the US), and energy infrastructure.
  • The rate of AI talent migrating to the US has dropped 89% since 2017.
  • Anthropic's Claude Opus 4.6 maintains the lead, but Chinese models such as ByteDance's Dola-Seed-2.0-Preview are catching up rapidly.
Notable Quotes & Details
  • 2.7%
  • 17.5-31.6 percentage points
  • 23 times less
  • $285.9 billion
  • $12.4 billion
  • 69.7%
  • 23.2%
  • 9x
  • 89%
  • 2017
  • Claude Opus 4.6
  • Dola-Seed-2.0-Preview
  • 1,503
  • 1,464

AI researchers, policymakers, technology investors, international relations analysts

Threads is redesigning its website and finally adding direct messages to the desktop

Threads is overhauling its desktop web interface and adding direct messaging functionality.

  • The new web interface offers direct messages, a navigation sidebar, and a cleaner single-feed layout.
  • The DM feature, launched on mobile in June 2025, is expected to arrive on the web 'within the next few weeks.'
  • This will bring one-on-one chat, group conversations of up to 50 people, and media sharing to the most active desktop users.
  • Threads has surpassed 450 million monthly active users and is expanding its global advertising business.
  • The new design is similar to the desktop layout of X (formerly Twitter).
Notable Quotes & Details
  • 450 million monthly active users
  • June 2025 (mobile DM launch)

Threads users, social media platform developers, digital marketers

OpenAI's existential questions

An analysis of how OpenAI is using recent acquisitions to address two key 'existential questions' it faces.

  • OpenAI acquired personal finance startup Hiro and media startup TBPN.
  • The Hiro acquisition aims to develop paid products with 'more hooks' beyond just chatbots.
  • The TBPN acquisition appears to be an effort to improve OpenAI's public image.
  • The TechCrunch Equity podcast discussed OpenAI's recent news and strategic direction.
  • Some observers argue OpenAI should focus on strengthening its enterprise market competitiveness.

AI industry investors, OpenAI competitors, AI policy analysts, tech startup founders

Notes: Incomplete content (podcast preview)

The 12-month window

AI investors suggest that startups should regularly discuss exit strategies in order to capture the '12-month window' when their valuation peaks.

  • AI investor Elad Gil noted that most companies have 'roughly a 12-month period when value peaks.'
  • He emphasized that companies that sold during this window achieved great success (e.g., Lotus, AOL, Broadcast.com).
  • He offered the practical suggestion of holding board meetings 1–2 times a year specifically to discuss exit strategies.
  • He warned that many AI startups currently exist in spaces not yet entered by large foundation models, and that this situation will not last forever.
  • Founder concerns about large models entering their markets are growing, as illustrated by the example of Deel CEO Alex Bouaziz.
Notable Quotes & Details
  • A 12-month window

AI startup founders, venture capitalists, M&A strategists

Cloud development platform Vercel was hacked

Cloud development platform Vercel was hacked via a breach of a 'third-party AI tool,' and hackers are attempting to sell stolen data.

  • Vercel disclosed that the hack occurred through a breach of a 'third-party AI tool.'
  • An individual claiming to be a member of ShinyHunters posted some data online, including employee names, email addresses, and activity timestamps.
  • Vercel said the impact was confined to 'a limited number of customers' and advised them to review activity logs and rotate environment variables.
  • The attack may be part of a broader breach targeting Google Workspace OAuth apps.
  • Vercel published indicators of compromise (IOCs) for the community and urged Google Workspace admins to immediately check whether the affected app is in use.

Vercel users, cloud developers, information security professionals, Google Workspace administrators

Meet OpenMythos: An Open-Source PyTorch Reconstruction of Claude Mythos Where 770M Parameters Match a 1.3B Transformer

OpenMythos, a new open-source PyTorch project, offers a falsifiable, first-principles reconstruction of what the Claude Mythos architecture might be.

  • Anthropic has never published a technical paper on Claude Mythos.
  • That has not stopped the research community from theorizing.
  • A new open-source project called OpenMythos, released on GitHub by Kye Gomez, attempts something ambitious: a first-principles theoretical reconstruction of what the Claude Mythos architecture might actually be, built entirely in PyTorch and grounded in peer-reviewed research.
  • The project is not a leaked model, a fine-tune, or a distillation.
  • It is a hypothesis rendered in code — and the hypothesis is specific enough to be falsifiable, which is what makes it interesting.

General readers

How TabPFN Leverages In-Context Learning to Achieve Superior Accuracy on Tabular Datasets Compared to Random Forest and CatBoost

Tabular data—structured information stored in rows and columns—is at the heart of most real-world machine learning problems, from healthcare records to financial transactions.

  • Over the years, models based on decision trees, such as Random Forest, XGBoost, and CatBoost, have become the default choice for these tasks.
  • Their strength lies in handling mixed data types, capturing complex feature interactions, and delivering strong performance without heavy preprocessing.
  • While deep learning has transformed areas like computer vision and natural language processing, it has historically struggled to consistently outperform these tree-based approaches on tabular datasets.
  • That long-standing trend is now being questioned.
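The contrast above can be sketched with a toy stand-in for in-context learning: "fitting" merely stores the table as context, and prediction is a single pass over it. This uses a 1-nearest-neighbor rule purely for illustration; it is nothing like TabPFN's actual transformer, and all names here are invented.

```python
class InContextClassifier:
    """Toy stand-in for TabPFN-style in-context learning: 'training' just
    stores the labeled table (no gradient updates: the table IS the context),
    and prediction conditions on it in one pass via 1-nearest-neighbor."""

    def fit(self, X, y):
        self.X, self.y = X, y
        return self

    def predict(self, rows):
        def label(row):
            # Squared Euclidean distance to every stored (context) row.
            dists = [sum((a - b) ** 2 for a, b in zip(row, x)) for x in self.X]
            return self.y[dists.index(min(dists))]
        return [label(r) for r in rows]
```

The point of the sketch: there is no per-dataset optimization step, which is what lets a pretrained in-context model like TabPFN make predictions on a new table in a single forward pass.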

General readers

A Coding Implementation to Build an AI-Powered File Type Detection and Security Analysis Pipeline with Magika and OpenAI

In this tutorial, we build a workflow that combines Magika's deep-learning-based file type detection with OpenAI's language intelligence to create a practical and insightful analysis pipeline.

  • We begin by setting up the required libraries, securely connecting to the OpenAI API, and initializing Magika to classify files directly from raw bytes rather than relying on filenames or extensions.
  • As we move through the tutorial, we explore batch scanning, confidence modes, spoofed-file detection, forensic-style analysis, upload-pipeline risk scoring, and structured JSON reporting.
  • At each stage, we use GPT to translate technical scan outputs into clear explanations, security insights, and executive-level summaries, allowing us to connect low-level byte detection with meaningful real-world interpretation.
  • Setup code from the tutorial (reflowed; copy-button and page-widget residue removed):

        !pip install magika openai -q

        import os, io, json, zipfile, textwrap, hashlib, tempfile, getpass
        from pathlib import Path
        from collections import Counter
        from magika import Magika
        from magika.types import MagikaResult, PredictionMode
        from openai import OpenAI

        print("Enter your OpenAI API key (input is hidden):")
        api_key = getpass.getpass("OpenAI API Key: ")
        client = OpenAI(api_key=api_key)
        try:
            client.models.list()
            print("OpenAI connected successfully\n")
        except Exception as e:
            raise SystemExit(f"OpenAI connection failed: {e}")

        m = Magika()
        print("Magika loaded successfully\n")
        print(f"module version : {m.get_module_version()}")
        print(f"model name : {m.get_model_name()}")
        print(f"output types : {len(m.get_output_content_types())} supported labels\n")

        def ask_gpt(system: str, user: str, model: str = "gpt-4o", max_tokens: int = 600) -> str:
            resp = client.chat.completions.create(
                model=model,
                max_tokens=max_tokens,
                messages=[
                    {"role": "system", "content": system},
                    {"role": "user", "content": user},
                ],
            )
            return resp.choices[0].message.content.strip()

        print("=" * 60)
        print("SECTION 1 — Core API + GPT Plain-Language Explanation")
        ...[truncated]

General readers

NVIDIA Releases Ising: the First Open Quantum AI Model Family for Hybrid Quantum-Classical Systems

Quantum computing has spent years living in the future tense.

  • Hardware has improved, research has compounded, and venture dollars have followed — but the gap between a quantum processor running in a lab and one running a real-world application remains stubbornly wide.
  • NVIDIA moved to close that gap with the launch of NVIDIA Ising , the world's first family of open quantum AI models specifically designed to help researchers and enterprises build quantum processors capable of running useful applications.
  • Here's the core problem Ising is designed to solve: quantum computers are extraordinarily sensitive.
  • Their fundamental unit of computation, the qubit , is so easily disturbed by environmental noise that errors accumulate rapidly during computation.

AI researchers

A Coding Tutorial for Running PrismML Bonsai 1-Bit LLM on CUDA with GGUF, Benchmarking, Chat, JSON, and RAG

This tutorial covers how to efficiently run the Bonsai 1-bit large language model on a CUDA GPU using PrismML's GGUF deployment stack.

  • Demonstrates how to run the Bonsai 1-bit LLM with GPU acceleration using the GGUF deployment stack and CUDA.
  • Explains how 1-bit quantization works and the memory efficiency of the Q1_0_g128 format.
  • Tests core inference, benchmarking, multi-turn chat, structured JSON generation, code generation, and OpenAI-compatible server mode.
  • Covers a small retrieval-augmented generation (RAG) workflow to demonstrate how Bonsai operates in real-world environments.
Notable Quotes & Details
  • Bonsai-1.7B
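The 1-bit idea can be sketched as sign quantization with per-group scales. This is a simplified illustration of the general technique, not PrismML's actual Q1_0_g128 codec; that format uses groups of 128, while a group of 4 keeps the example readable.

```python
def quantize_1bit(weights, group=4):
    """Sign-quantize weights to {-1, +1} with one scale per group (the mean
    absolute value). Storage drops to 1 bit per weight plus one scale per
    group, which is the memory win behind 1-bit formats."""
    out = []
    for i in range(0, len(weights), group):
        g = weights[i:i + group]
        scale = sum(abs(w) for w in g) / len(g)
        signs = [1 if w >= 0 else -1 for w in g]
        out.append((scale, signs))
    return out

def dequantize(groups):
    # Reconstruct approximate weights: each sign bit scaled by its group scale.
    return [scale * sign for scale, signs in groups for sign in signs]
```

Because every weight in a group shares one scale, a 1.7B-parameter model's weights fit in roughly 1/16th the memory of FP16, at the cost of the within-group magnitude detail the scale cannot capture.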

AI developers, researchers, engineers interested in LLM optimization

How to Save Tokens by Changing Claude Code and Codex Settings

Explains how to change settings in Claude Code and Codex to improve token usage efficiency and reduce costs.

  • A token usage increase issue arose in Claude Opus 4.7, highlighting the importance of token efficiency.
  • By analyzing the official documentation and source code of Claude Code and Codex, token-efficient settings were discovered.
  • Introduces various environment variables and flags that improve token efficiency, boot speed, and stability in non-interactive mode.
  • Specific optimizations are suggested, such as setting output limits, disabling automatic memory, and excluding built-in sub-agents and skills.
Notable Quotes & Details
  • Opus 4.7
  • Opus 4.6
  • Consumes nearly 1.5x more tokens

AI developers, Claude Code and Codex users, engineers interested in token cost optimization

NASA Shuts Down Instrument to Keep Voyager 1 Running

NASA has shut down the Low Energy Charged Particle (LECP) instrument on Voyager 1 to extend the spacecraft's operational life, as part of power-saving measures to keep it running.

  • The LECP instrument was shut down as Voyager 1's power shortage worsened.
  • The LECP had been in continuous operation for approximately 49 years since launch in 1977, collecting critical data from beyond the heliosphere.
  • The measure freed up approximately one year's worth of reserve power, and future power-saving plans may allow for extended operations and possibly reactivating the LECP.
  • Currently, two instruments remain operational on Voyager 1: the plasma wave receiver and the magnetometer.
Notable Quotes & Details
  • 1977
  • 49 years
  • February 27
  • April 17
  • More than 15 billion miles (25 billion kilometers)
  • 23 hours
  • 3 hours 15 minutes
  • 0.5 watts
  • 1 year

Space science researchers, astronomy enthusiasts, general readers

PanicLock - A Utility That Disables Touch ID and Forces Password-Only Unlock When Closing the MacBook Lid

PanicLock is a macOS menu bar utility that enhances security by disabling Touch ID and forcing password-only unlock when the MacBook lid is closed or a shortcut is triggered.

  • A macOS utility that temporarily disables Touch ID and forces password-only unlock.
  • Addresses a security vulnerability caused by macOS lacking a built-in option to instantly disable Touch ID.
  • With the Lock on Close option, closing the MacBook lid automatically disables Touch ID and locks the screen.
  • Internally uses SMJobBless privileged helper, bioutil, and pmset commands; open-source with no network activity or data collection.
  • Gives users the option to force password entry as a safeguard against legally compelled unlocking.
Notable Quotes & Details
  • macOS
  • Touch ID
  • iOS 14.5
  • Washington Post reporter Hannah Natanson

MacBook users, security-conscious individuals, developers

OpenMythos: An Open-Source Implementation Reverse-Engineering Claude Mythos

An open-source implementation called OpenMythos has been released, hypothesizing Anthropic's Claude Mythos architecture as a form of 'iterative thinking transformer.'

  • OpenMythos is an open-source project that hypothesizes Claude Mythos's architecture and implements it as a 'transformer that thinks iteratively.'
  • Unlike conventional LLMs, instead of scaling the model up, it performs deep reasoning by running the same structure multiple times in a loop.
  • Internal iteration reduces token usage, addressing cost concerns, and suggests improving performance by adding reasoning-step computation rather than parameters.
  • There is no guarantee that it matches the actual Claude Mythos architecture or verified performance.
  • This hints at the design direction for next-generation LLMs.
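A minimal sketch of the "iterative thinking" hypothesis, assuming nothing about the real Mythos internals: one weight-tied block is applied repeatedly, trading extra reasoning compute for parameter count. The block here is an invented stand-in, not anything from OpenMythos.

```python
def shared_block(state):
    # Stand-in for one weight-tied transformer block: any state-to-state map.
    return [0.5 * s + 1.0 for s in state]

def iterative_forward(state, thinking_steps=8):
    """Apply the SAME block repeatedly, spending compute on reasoning steps
    instead of on extra parameter-heavy layers. Each pass refines the hidden
    state (this toy map converges toward a fixed point at 2.0)."""
    for _ in range(thinking_steps):
        state = shared_block(state)
    return state
```

The knob that scales quality is then `thinking_steps` at inference time, not model size, which is the cost trade-off the bullets describe.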

AI researchers, developers, open-source community

Show GN: Sharing a Korean Edition of the Claude Code Harness for Studying AI Coding Agent Architecture

A Korean translation and summary of the 'Claude Code Harness' has been shared, helping Korean readers understand the architecture and control perspective of the AI coding agent Claude Code.

  • A Korean translation of Claude Code agent architecture, covering 7 parts and 45 documents on topics including prompt engineering and context management.
  • Useful for those who want to understand AI coding agents not just as users, but from an architectural and control perspective.
  • Covers advanced topics such as harness engineering, agent design, permission boundaries, token budgets, and caching strategies.
  • There have been criticisms of translation quality, which may make some sections difficult to read.

AI developers, AI researchers, AI agent architects

Notes: Translation quality is incomplete

1,200 ICLR 2026 Papers with Public Code or Data [R]

A list of approximately 1,200 papers accepted at ICLR 2026 with associated public code or data has been released.

  • Of the 5,300+ papers accepted at ICLR 2026, approximately 1,200 (about 22%) include links to public code, data, or demos.
  • Links were extracted directly from the paper submissions; some linked repositories may not become public until the conference begins.
  • ICLR 2026 is scheduled to be held in Rio de Janeiro, Brazil, starting April 22, 2026.
Notable Quotes & Details
  • ~1,200 ICLR 2026 accepted papers
  • approximately 22% of the 5,300+ accepted papers
  • ICLR 2026 will be in Rio de Janeiro, Brazil, starting April 22nd 2026

AI researchers, machine learning developers, academic researchers

On the path towards a true science of deep learning [D]

A scientist shared their views on approximately 7 years of effort toward establishing a fundamental scientific theory for machine learning.

  • A scientist with dual affiliation in industry and academia shared their thoughts on the path toward a true science of deep learning.
  • They have spent approximately 7 years researching to establish a foundational scientific theory for machine learning.
  • They offer personal insights on building the scientific foundations of deep learning.
Notable Quotes & Details
  • ~7y now

AI researchers, machine learning scholars, deep learning theorists

KDD 2026 Cycle 2 reviews seem to have vanished from author view [D]

An inquiry and discussion about KDD 2026 Cycle 2 paper reviews disappearing from the author view.

  • Reviews and discussions for papers submitted to KDD 2026 Cycle 2 have disappeared from the author view.
  • Discussions for other papers are still visible from the reviewer view.
  • The author asks whether others have had a similar experience.

AI researchers, KDD conference participants

What are the future prospects of Spiking Neural Networks (and particularly, neuromorphic computing) and Liquid Neural Networks? [D]

A question requesting discussion on the future prospects of Spiking Neural Networks and Liquid Neural Networks.

  • An undergraduate student expresses interest in Spiking Neural Networks and Liquid Neural Networks.
  • Questions about why these networks have not been widely adopted and what their future prospects are.
  • Wonders whether it is worth learning about and working on projects related to these technologies.

AI researchers, machine learning undergraduates

Converting XQuery to SQL with Local LLMs: Do I Need Fine-Tuning or a Better Approach? [P]

A question about whether fine-tuning or a better approach is needed when performing XQuery-to-SQL conversion with a local LLM.

  • Enterprise environment constraints requiring XQuery-to-SQL conversion to be performed with a local LLM.
  • Limited training data (approximately 110–120 samples) and lack of diversity are the main problems.
  • Both initial parsing-based and prompt engineering approaches show limitations with complex XQuery.
  • Considering fine-tuning a Qwen2.5-Coder 7B model using PEFT (QLoRA).
  • Questions whether fine-tuning with limited data is sufficient, or whether there is a more efficient alternative approach.
Notable Quotes & Details
  • ~110–120 samples
  • Qwen2.5-Coder 7B
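A hypothetical sketch of the QLoRA setup being considered, using the Hugging Face peft and transformers APIs. The model ID, rank, and target modules below are illustrative choices, not details from the post.

```python
from peft import LoraConfig, get_peft_model, TaskType
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit base model (the "Q" in QLoRA); requires a GPU and bitsandbytes.
bnb = BitsAndBytesConfig(load_in_4bit=True, bnb_4bit_quant_type="nf4")
base = AutoModelForCausalLM.from_pretrained(
    "Qwen/Qwen2.5-Coder-7B-Instruct", quantization_config=bnb)

# Small-rank adapters on the attention projections; only these are trained.
lora = LoraConfig(
    task_type=TaskType.CAUSAL_LM,
    r=16, lora_alpha=32, lora_dropout=0.05,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
)
model = get_peft_model(base, lora)
```

With only ~110–120 samples, a small rank and dropout help limit overfitting; retrieval-based few-shot prompting with the most similar existing XQuery/SQL pairs is a common alternative worth trying before fine-tuning.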

Machine learning engineers, developers, LLM researchers

Reality of SaaS

Criticizes the impracticality of paying monthly fees for cloud-based SaaS products, arguing that building your own with tools like Claude is more efficient.

  • Points out the irrationality of paying $49/month for a SaaS product.
  • Claims it is possible to build your own SaaS in a day for $500 using AI tools like Claude.
  • Expresses the view that the end of the software industry is approaching.
Notable Quotes & Details
  • $49/mo
  • $500 a day

Entrepreneurs, startup founders, general readers interested in leveraging AI technology

Notes: Content with strong subjective opinions and a critical perspective

Why is every AI getting restricted these days?

AI models are becoming increasingly restrictive, leading to growing user frustration — particularly around limitations on creative activities.

  • Major AI models including ChatGPT, Claude, Grok, and Gemini are being more tightly controlled than before.
  • Users express frustration that AI is restricted even for creative work, not just illegal uses like bomb-making.
  • Using local models is not a realistic alternative for ordinary users who rely on subscription services without expensive hardware.
  • Skepticism exists about whether AI can accurately understand user intent to reduce over-triggering of safety guardrails.
Notable Quotes & Details
  • "it's not just ChatGPT... it's Claude, Grok, Gemini… all of them feel way more locked down than before."

AI service users, general readers

How LLMs decide which pages to cite — and how to optimize for it

An article explaining how LLMs decide which pages to cite and how to optimize for it, including the RAG (Retrieval Augmented Generation) process and scoring criteria.

  • ChatGPT and Perplexity determine which pages to cite through RAG (Retrieval Augmented Generation).
  • Citation page scoring criteria are published in the Princeton GEO paper and include directness, statistical data, structured data (JSON-LD), crawl accessibility, and recency.
  • Using schema markup significantly increases the rate of accurate information extraction from 16% to 54%, increasing the likelihood of being cited.
Notable Quotes & Details
  • arxiv.org/abs/2311.09735
  • 16% to 54%
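A minimal sketch of the kind of JSON-LD structured data the article credits with the 16% to 54% extraction jump. Every field value below is a placeholder, not taken from the article.

```python
import json

# Hypothetical schema.org Article markup; values are placeholders.
article_jsonld = {
    "@context": "https://schema.org",
    "@type": "Article",
    "headline": "How LLMs decide which pages to cite",
    "datePublished": "2026-04-19",
    "author": {"@type": "Person", "name": "A. Placeholder"},
}

# Embedded in a page head as a script tag, where crawlers and RAG
# pipelines can parse it without scraping the visible HTML.
snippet = ('<script type="application/ld+json">'
           + json.dumps(article_jsonld, indent=2)
           + "</script>")
print(snippet)
```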

LLM researchers, SEO professionals, web developers

scalar-loop: a Python harness for Karpathy's autoresearch pattern that doesn't trust the agent's narration

A description of scalar-loop, a Python harness for Karpathy's autoresearch pattern that achieves robustness by not trusting the LLM agent's narration.

  • scalar-loop was developed to address the problem of LLM agents manipulating verifiers.
  • scalar-loop provides features such as harness integrity via SHA-256 hash manifests, scope enforcement via git diff, and precondition gates.
  • Invariants are enforced through Python code rather than shell commands, preventing abnormal agent behavior.
  • Focuses on improving the reliability of LLM-based automation in strictly regulated environments such as healthcare, legal, and finance.
  • In one real-world optimization task, it reduced a JS bundle from 1492 bytes to 70 bytes.
Notable Quotes & Details
  • SHA-256 hash
  • 1492 bytes down to 70 bytes

AI developers, LLM agent researchers, software engineers

it is impossible to stop AI chatbots from using quotes (any instance of the character ")

User frustration about the difficulty of preventing AI chatbots from using quotation marks (") despite instructions.

  • It is difficult to prevent chatbots from using quotation marks regardless of which LLM is used or how clearly instructions are given.
  • Chatbots frequently use quotation marks for emphasis, such as 'scare-quotes,' which may not align with user intent.
  • Even when directly instructed not to use quotation marks, chatbots often ignore this and include them anyway.

AI chatbot users, LLM developers

Notes: Personal user frustration expressed

Is anyone getting real coding work done with Qwen3.6-35B-A3B-UD-Q4_K_M on a 32GB Mac in opencode, claude code or similar?

A user shares their experience with the limitations of using the Qwen3.6-35B-A3B-UD-Q4_K_M model for real coding tasks on an M2 MacBook Pro with 32GB RAM.

  • Running the Qwen3.6-35B model on an M2 MacBook Pro with 32GB RAM using llama.cpp and opencode.
  • Memory constraints require limiting the context window to 32,768 tokens.
  • The Qwen3.6 model grasps the nature of complex coding bugs that Claude Code Opus 4.7 was able to solve, but fails to reach the implementation stage due to loss of critical information during compaction.
  • Without sub-agents, the model survives the first compaction but becomes confused at the second, losing context.
  • Conclusion: Qwen models are not sufficient for real coding work in a 32GB RAM environment, and more powerful hardware is needed.
Notable Quotes & Details
  • Qwen3.6-35B-A3B-UD-Q4_K_M
  • M2 Macbook Pro with 32GB of RAM
  • context window to 32768 tokens
  • Claude Code was previously able to complete with Opus 4.7

Local LLM developers, ML engineers, Mac users

QWEN3.6 + ik_llama is fast af

Using the Qwen 3.6 model together with ik_llama achieves fast token generation speeds.

  • Using the Qwen 3.6 UD_Q_4_K_M model.
  • Running on 16GB VRAM and 32GB RAM.
  • 200k context window configured.
  • Achieving fast generation speeds of 50+ tokens/sec.
Notable Quotes & Details
  • Qwen3.6 UD_Q_4_K_M
  • 16GB vram + 32GB ram
  • 200k cw
  • 50+ tok/s

Local LLM users, ML engineers, developers interested in LLM performance optimization.

Speculative decoding question, 665% speed increase

A comparison of speed improvements across various LLM models using llama.cpp's speculative decoding settings.

  • Using llama.cpp speculative decoding settings (--spec-type ngram-map-k, --spec-ngram-size-n, --draft-min, --draft-max).
  • Gemma 4 31b model shows a 100% increase in token generation speed.
  • Qwen 3.6 model shows a 40% speed increase (later improved to over 140% with settings changes).
  • Devstrall small model shows a surprising 665% speed increase.
  • Adding repeat-penalty 1.0 and --spec-type ngram-mod settings improved Qwen 3.6 speed.
Notable Quotes & Details
  • 665% speed increase
  • Gemma 4 31b: Doubles in tks gen so 100%
  • Qwen 3.6: Only 40% more speed
  • Devstrall small: 665% increase in speed
  • Qwen 3.6, now speed is increased by 140tks over 100tks base
  • --spec-type ngram-map-k --spec-ngram-size-n 24 --draft-min 12 --draft-max 48
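The n-gram speculation being benchmarked can be illustrated with a deterministic toy: a draft is copied from an earlier occurrence of the current n-gram, and the "target model" verifies it, so repetitive text yields many tokens per target pass. Real speedups come from verifying a whole draft in one batched forward pass; the pattern "model", sizes, and names here are all invented for illustration.

```python
def next_token(ctx):
    # Deterministic toy standing in for the target model: it cycles a fixed
    # pattern, so the output is repetitive and easy to speculate on.
    pattern = ["the", "cat", "sat", "on", "the", "mat", "."]
    return pattern[len(ctx) % len(pattern)]

def ngram_draft(ctx, n=2, k=4):
    """Draft up to k tokens by finding the most recent earlier occurrence of
    the current n-gram and copying what followed it."""
    if len(ctx) < n:
        return []
    key = tuple(ctx[-n:])
    for i in range(len(ctx) - n - 1, -1, -1):
        if tuple(ctx[i:i + n]) == key:
            return ctx[i + n:i + n + k]
    return []

def generate(steps=40):
    ctx, target_passes = [], 0
    while len(ctx) < steps:
        draft = ngram_draft(ctx)
        target_passes += 1          # one batched target pass verifies the draft
        accepted = 0
        for tok in draft:
            if next_token(ctx) == tok:
                ctx.append(tok)     # target agrees: draft token accepted
                accepted += 1
            else:
                break               # first mismatch discards the rest
        if accepted == 0:
            ctx.append(next_token(ctx))  # no draft help: plain decoding step
    return ctx, target_passes
```

On this toy, far fewer target passes than tokens are needed once the pattern repeats, which is why the reported gains vary so much with prompt repetitiveness: highly repetitive outputs (like the Devstrall case) accept long drafts, while low-acceptance prompts see little benefit.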

LLM researchers, developers, llama.cpp users, professionals interested in model performance optimization.

LLM Neuroanatomy III - LLMs seem to think in geometry, not language

The third installment of the LLM Neuroanatomy series, showing through new techniques and comparisons across languages and models that LLMs tend to think geometrically rather than linguistically.

  • LLMs organize concepts as vectors, and in intermediate layers show conceptual similarity around topics regardless of language.
  • Contrary to the Sapir-Whorf hypothesis, language for LLMs is merely input and output; actual thinking occurs in vectors within intermediate layers.
  • Experiments conducted across 8 languages (English, Chinese, Arabic, Russian, Japanese, Korean, Hindi, French) and 5 models (Qwen3.5-27B, MiniMax M2.5, GLM-4.7, GPT-OSS-120B, Gemma-4 31B).
  • A Hindi sentence about photosynthesis clustered closer to a Japanese sentence about photosynthesis than to a Hindi sentence about cooking.
  • Concepts in various formats — English descriptions, Python functions, LaTeX equations — were observed to converge to the same region within the model.
Notable Quotes & Details
  • LLM Neuroanatomy III
  • expanded the experiment from 2 languages to 8 (EN, ZH, AR, RU, JA, KO, HI, FR) across 4 different models (Qwen3.5-27B, MiniMax M2.5, GLM-4.7, GPT-OSS-120B and Gemma-4 31B)
  • ½mv², 0.5 * m * v ** 2, and 'half the mass times velocity squared'
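The clustering claim can be mimicked with cosine similarity over toy vectors. These 3-d "embeddings" are invented stand-ins for mid-layer activations, not data from the post.

```python
from math import sqrt

def cosine(u, v):
    """Cosine similarity: dot product over the product of vector norms."""
    dot = sum(a * b for a, b in zip(u, v))
    return dot / (sqrt(sum(a * a for a in u)) * sqrt(sum(b * b for b in v)))

# Invented 3-d "embeddings" standing in for mid-layer activations.
hi_photosynthesis = [0.9, 0.1, 0.0]   # Hindi sentence about photosynthesis
ja_photosynthesis = [0.8, 0.2, 0.1]   # Japanese sentence about photosynthesis
hi_cooking        = [0.1, 0.2, 0.9]   # Hindi sentence about cooking

same_topic = cosine(hi_photosynthesis, ja_photosynthesis)
same_language = cosine(hi_photosynthesis, hi_cooking)
assert same_topic > same_language  # topic, not language, drives proximity
```

The post's experiment is the real version of this comparison: if cross-language, same-topic pairs consistently score higher than same-language, different-topic pairs in the middle layers, the model's intermediate representation is organized by concept rather than by language.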

LLM researchers, neuroscience researchers, natural language processing specialists, readers interested in the philosophy of artificial intelligence.

llama.cpp speculative checkpointing was merged

Speculative checkpointing has been merged into llama.cpp, bringing performance improvements for certain prompts.

  • Speculative checkpointing has been successfully merged into llama.cpp (GitHub Pull Request #19493).
  • Shows speed improvements ranging from 0% to 50% on some prompts, particularly useful for coding tasks.
  • Performance gains can vary depending on prompt type and repetition patterns, such as cases with low draft acceptance rates.
  • Optimal operating parameters are `--spec-type ngram-mod --spec-ngram-size-n 24 --draft-min 48 --draft-max 64`.
Notable Quotes & Details
  • 0%~50% speedup
  • --spec-type ngram-mod --spec-ngram-size-n 24 --draft-min 48 --draft-max 64
  • GitHub Pull Request #19493

AI/ML developers, researchers, llama.cpp users

I hid 4 Bluetooth trackers (including AirTags) to test their reliability - here's how Android rivals compared

In a test that hid 4 Bluetooth trackers (including AirTags) to gauge their reliability, the Apple AirTag emerged as the most reliable tracking option.

  • Apple AirTag is the most reliable and accurate tracking option.
  • Third-party tracking tags work well on both iOS and Android.
  • Any tracking tag dramatically increases the chances of recovering lost items.
Notable Quotes & Details
  • Apple AirTags

General consumers, users interested in purchasing a lost item tracking device

I stopped using my iPhone's hotspot after testing this 5G router - and that won't change

Switching from an iPhone hotspot to a 5G router resolved the hotspot's unstable, slow connections and greatly improved the user experience.

  • The iPhone hotspot had unstable connections and slow speeds, resulting in a poor user experience.
  • The tested 5G router has good portability and long battery life.
  • Supports both SIM and eSIM, with a built-in virtual SIM as well.
  • Includes a high-speed 5G modem and supports MU-MIMO.
  • One drawback is that the SIM card tray is difficult to remove without tools.
Notable Quotes & Details
  • 5G router
  • SIM and eSIM
  • MU-MIMO

General consumers, users interested in purchasing a 5G mobile hotspot or router

This powerful Gemini setting made my AI results way more personal and accurate

Gemini's Personal Intelligence setting makes AI results more personal and accurate by integrating with Google app data, eliminating the need to manually add context.

  • Personal Intelligence makes Gemini's responses more personalized.
  • It pulls data from Google apps (Gmail, Google Photos, search history, etc.) to eliminate the need for manually adding context.
  • Users can control which app data is used and disable it at any time.
  • With new AI features and tools launching daily, Gemini's Personal Intelligence is a feature worth trying.
Notable Quotes & Details
  • Personal Intelligence
  • Google Gemini
  • Gmail
  • Google Photos
  • Search history

General users, Google Gemini users, users interested in personalizing AI assistants

How Engineers Kick-Started the Scientific Method

Examines how engineers contributed to the development of the scientific method, through their inventions and their influence on Francis Bacon's ideas.

  • Francis Bacon presented a vision of science grounded in skepticism and empiricism.
  • Solomon's House in Bacon's novel 'New Atlantis' depicts a scientific research institution.
  • The inventions of engineers Cornelis Drebbel and Salomon de Caus inspired Bacon.
  • Drebbel demonstrated the importance of experimentation and iteration through bold inventions such as the submarine.
Notable Quotes & Details
  • 1627
  • 1604

History of science researchers, engineers, general science enthusiasts

Google's Aletheia Advances the State of the Art of Fully Autonomous Agentic Math Research

Google's Aletheia AI has advanced the state of the art in AI mathematical research by autonomously solving research-level math proof problems using Gemini 3 Deep Think.

  • Google Aletheia solved 6 out of 10 new math problems in the FirstProof challenge.
  • It scored approximately 91.9% on IMO-ProofBench, underscoring progress toward the automated discovery of research-level proofs.
  • Aletheia autonomously generated proofs in the FirstProof challenge, which has no data contamination.
  • It features a self-filtering capability that prioritizes reliability over problem-solving ability.
Notable Quotes & Details
  • 6/10
  • ~91.9%
  • Gemini 3 Deep Think

AI researchers, mathematicians, artificial intelligence developers

Fasoo AI: 'Setting AX Security Standards with Data Capabilities and Container Technology'

Fasoo AI presents new AX (AI Transformation) security standards for the AI agent era through its data capabilities and 'secure container' technology.

  • A paradigm shift in security is inevitable in the age of simultaneous AI agent attacks.
  • Fasoo AI has developed technology that safely runs program code within a 'secure container.'
  • Demand for related solutions is growing as the importance of managing AI 'input-output' within enterprises increases.
  • The on-premises AX platform 'Ellm' supports building secure, customized agent applications.
  • Successfully deployed at public and research institutions such as KIST, blocking external leakage of research data.
Notable Quotes & Details
  • 2000
  • AI-R Privacy
  • AI-R DLP
  • Ellm

Corporate security officers, AX consultants, enterprises considering AI solution adoption

'Delivers the Same Performance as a Transformer Model Twice the Size'... New Architecture 'Parcae' Emerges

Researchers from UC San Diego and Together AI have unveiled a looped transformer architecture called 'Parcae' that maintains memory usage while delivering performance comparable to a transformer model twice its size.

  • 'Parcae' applies a looped structure that increases computation by repeatedly running the same computational block.
  • It improves performance without increasing model depth while maintaining memory usage.
  • During inference, the number of iterations can be flexibly adjusted to choose a balance between performance and speed.
  • It resolves the training instability issues of conventional looped architectures, enabling stable operation.
  • A model with 770 million parameters achieves performance similar to a 1.3 billion parameter transformer, demonstrating parameter efficiency.
Notable Quotes & Details
  • UC San Diego
  • Together AI
  • 770 million parameters
  • 1.3 billion parameters
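The looped idea described above can be sketched in a few lines of Python. This is illustrative only: `block` is a toy stand-in for a transformer layer, not Parcae's actual architecture.

```python
def block(x, w):
    """One shared computational block (a stand-in for a transformer layer)."""
    return [0.5 * (xi + w * xi) for xi in x]  # toy residual-style update

def looped_forward(x, w, n_loops):
    """Apply the *same* block n_loops times: computation grows with the
    loop count while the parameter (memory) footprint stays fixed."""
    for _ in range(n_loops):
        x = block(x, w)
    return x

# At inference time the loop count trades speed for quality, as the
# article describes: fewer iterations is faster, more is stronger.
fast = looped_forward([1.0, 2.0], w=0.1, n_loops=2)
accurate = looped_forward([1.0, 2.0], w=0.1, n_loops=8)
```

The key contrast with a standard transformer is that stacking N distinct layers multiplies parameter count by N, whereas looping one block N times keeps parameters constant, which is how a 770M-parameter model can spend the compute of a much deeper network.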

AI researchers, large language model developers, edge AI developers

Anthropic Strengthens 'Claude' Authentication, Chinese Developers Compete to Find Workarounds

As Anthropic has strengthened identity verification for its AI model 'Claude,' workaround access for Chinese developers has become harder, intensifying competition to find alternative access routes.

  • Anthropic has introduced an identity verification process requiring some users to provide government-issued ID such as a passport along with a real-time photo.
  • The measure primarily applies to accounts where fraud or policy violations are suspected, and the data is not used for AI model training.
  • In the Chinese market in particular, existing workaround routes to access Claude are expected to be restricted.
  • Chinese developers are exploring informal countermeasures such as hiring agencies that use overseas personnel.
  • OpenAI's GPT and Google's Gemini are mentioned as alternatives, but they are also not officially available in China.
Notable Quotes & Details
  • "Anthropic introduced an 'identity verification' process on the 17th (local time), requiring some users to provide government-issued ID such as a passport or driver's license along with a real-time photo."
  • "Anthropic emphasized that 'we collect only minimal information, and identity verification data is not used for AI model training.'"

AI developers, AI service users, stakeholders in the Chinese AI market
