[MacOS] search that works with weird UTF-8-MAC encoding of special characters (e.g. umlauts) #3622

@fipski

Description

yazi --debug output

Yazi
    Version: 26.1.22 (Homebrew 2026-01-22)
    Debug  : false
    Triple : aarch64-apple-darwin (macos-aarch64)
    Rustc  : 1.92.0 (ded5c06c 2025-12-08)

Ya
    Version: 26.1.22 (Homebrew 2026-01-22)

Config
    Init             : /Users/phil/.config/yazi/init.lua (1543 chars)
    Yazi             : /Users/phil/.config/yazi/yazi.toml (7419 chars)
    Keymap           : /Users/phil/.config/yazi/keymap.toml (21572 chars)
    Theme            : /Users/phil/.config/yazi/theme.toml (76 chars)
    VFS              : /Users/phil/.config/yazi/vfs.toml (No such file or directory (os error 2))
    Package          : /Users/phil/.config/yazi/package.toml (529 chars)
    Dark/light flavor: "tokyonight-night" / ""

Emulator
    TERM                : Some("alacritty")
    TERM_PROGRAM        : Some("tmux")
    TERM_PROGRAM_VERSION: Some("3.6a")
    Brand.from_env      : Some(WezTerm)
    Emulator.detect     : Emulator { kind: Left(WezTerm), version: "WezTerm 20240203-110809-5046fc22", light: false, csi_16t: (9, 17), force_16t: false }

Adapter
    Adapter.matches    : Iip
    Dimension.available: Dimension { rows: 37, columns: 131, width: 1179, height: 629 }

Desktop
    XDG_SESSION_TYPE           : None
    WAYLAND_DISPLAY            : None
    DISPLAY                    : None
    SWAYSOCK                   : None
    HYPRLAND_INSTANCE_SIGNATURE: None
    WAYFIRE_SOCKET             : None

SSH
    shared.in_ssh_connection: false

WSL
    WSL: false

Variables
    SHELL              : Some("/bin/zsh")
    EDITOR             : Some("/opt/homebrew/bin/nvim")
    VISUAL             : None
    YAZI_FILE_ONE      : None
    YAZI_CONFIG_HOME   : None
    YAZI_ZOXIDE_OPTS   : None
    FZF_DEFAULT_OPTS   : None
    FZF_DEFAULT_COMMAND: None

Text Opener
    default     : Some(OpenerRule { run: "${EDITOR:-vi} %s", block: true, orphan: false, desc: "$EDITOR", for: None, spread: true })
    block-create: Some(OpenerRule { run: "${EDITOR:-vi} %s", block: true, orphan: false, desc: "$EDITOR", for: None, spread: true })
    block-rename: Some(OpenerRule { run: "${EDITOR:-vi} %s", block: true, orphan: false, desc: "$EDITOR", for: None, spread: true })

Multiplexers
    TMUX               : true
    tmux version       : tmux 3.6a
    tmux build flags   : enable-sixel=Supported
    ZELLIJ_SESSION_NAME: None
    Zellij version     : No such file or directory (os error 2)

Dependencies
    file          : 5.41
    ueberzugpp    : No such file or directory (os error 2)
    ffmpeg/ffprobe: 8.0.1 / 8.0.1
    pdftoppm      : 26.01.0
    magick        : 7.1.2-13
    fzf           : 0.67.0
    fd/fdfind     : 10.3.0 / No such file or directory (os error 2)
    rg            : 15.1.0
    chafa         : No such file or directory (os error 2)
    zoxide        : 0.9.8
    7zz/7z        : 25.01 / No such file or directory (os error 2)
    resvg         : No such file or directory (os error 2)
    jq            : 1.8.1

Clipboard
    wl-copy/paste: No such file or directory (os error 2) / No such file or directory (os error 2)
    xclip        : No such file or directory (os error 2)
    xsel         : No such file or directory (os error 2)

Routine
    `file -bL --mime-type`: text/plain


See https://yazi-rs.github.io/docs/plugins/overview#debugging on how to enable logging or debug runtime errors.

Please describe the problem you're trying to solve

I'm new to macOS and still discovering its quirks. One is that all my synced files ended up with names in an unusual encoding.
Apparently this is called decomposed normalization form (NFD).

An ä is stored as 'a' plus the Unicode combining character for the diaeresis (the two dots):
ä -> a + U+0308
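
To make the difference concrete, here is a minimal sketch using Python's standard `unicodedata` module. It only illustrates the two forms; the helper name `same_after_nfc` is made up for this example and has nothing to do with Yazi or fd.

```python
import unicodedata

composed = "\u00e4"     # ä as one code point (composed form, NFC)
decomposed = "a\u0308"  # 'a' followed by U+0308 COMBINING DIAERESIS (NFD)

def same_after_nfc(a: str, b: str) -> bool:
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)

# The strings render identically but differ code point by code point.
print(composed == decomposed)                                # False
print(unicodedata.normalize("NFD", composed) == decomposed)  # True
print(same_after_nfc(composed, decomposed))                  # True
```

A plain byte or code-point comparison sees two different strings, which is presumably why a query typed in composed form misses file names stored in decomposed form.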

Say we are looking for "mäc-ümlaut": fd does not find it with the query "mäc".
What does work is: find . | iconv -f UTF-8-MAC -t UTF-8 | grep mäc
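
The pipeline above normalizes the file names before matching. The same idea can be sketched in Python, assuming that normalizing both the query and each name to NFC is close enough to what `iconv -f UTF-8-MAC` does (Apple's variant differs from standard NFD for some code point ranges, so this is an approximation). `find_names` and the sample names are hypothetical.

```python
import unicodedata

def nfc(text: str) -> str:
    """Normalize text to the composed form (NFC)."""
    return unicodedata.normalize("NFC", text)

def find_names(names, query):
    """Return the names containing query, ignoring composed/decomposed differences."""
    needle = nfc(query)
    return [name for name in names if needle in nfc(name)]

# Sample names: one decomposed (as synced to macOS), one composed, one without umlauts.
names = ["ma\u0308c-u\u0308mlaut", "m\u00e4c-composed", "mac-plain"]
matches = find_names(names, "m\u00e4c")
print(matches)
```

Both umlaut names match here regardless of how they are stored, while "mac-plain" does not.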

I made a very limited PoC as a Lua plugin. Of course, files can still be written with regular (composed) Unicode names, so those would need a second search pattern.
I'm not sure how other file managers handle this; I never noticed this issue.

https://github.com/fipski/mac-umlaut.yazi

There is a discussion for fd, without a really good conclusion.
sharkdp/fd#1535

Some insights from Apple:
https://developer.apple.com/library/archive/qa/qa1235/_index.html
Damn encoding!

Would you be willing to contribute this feature?

  • Yes, I'll give it a shot

Describe the solution you'd like

In conclusion, I don't know how to solve this properly. A file name could be encoded either way, so perhaps search for both strings?
It would probably be best to replace each umlaut in the search input with a regex that matches both forms.
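
A minimal sketch of that regex idea, in Python for illustration only (it is not the linked plugin's code, and `both_forms_pattern` is a hypothetical name). Each character of the query that has a decomposition is replaced by an alternation of its composed and decomposed forms. Whether fd or Yazi's search accepts the resulting pattern unchanged is an assumption I have not verified.

```python
import re
import unicodedata

def both_forms_pattern(query: str) -> str:
    """Build a regex that matches query in composed or decomposed form."""
    parts = []
    # Normalize first so a query typed in decomposed form is handled the same way.
    for ch in unicodedata.normalize("NFC", query):
        decomposed = unicodedata.normalize("NFD", ch)
        if decomposed == ch:
            # No decomposition exists: match the character literally.
            parts.append(re.escape(ch))
        else:
            # Accept either the single code point or base letter + combining mark.
            parts.append("(?:" + re.escape(ch) + "|" + re.escape(decomposed) + ")")
    return "".join(parts)

pattern = re.compile(both_forms_pattern("m\u00e4c"))
print(bool(pattern.search("ma\u0308c-u\u0308mlaut")))  # decomposed file name
print(bool(pattern.search("m\u00e4c-composed")))       # composed file name
```

This keeps a single search pass and leaves file names untouched, at the cost of a longer pattern for queries with many accented characters.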

Additional context

No response

Checklist

  • I have searched the existing issues/discussions
  • The latest nightly build doesn't already have this feature

Metadata

Assignees

No one assigned

    Labels

    feature: New feature request
