Stop Syncing and Start Mounting: A Smarter Alternative to aws s3 sync
If you’re still relying on aws s3 sync to move data between your local directory and Amazon S3, you’re not alone. It’s one of the first tools engineers reach for in the AWS Command Line Interface: simple, familiar, and effective for basic file transfers. But as your datasets grow, the limitations of syncing start to show. Repetitive transfers, growing latency, and fragile CLI workarounds turn what should be a solved problem into a persistent bottleneck.
Why Everyone Starts with aws s3 sync
For most engineers, aws s3 sync is the go-to command when you need to move files to or from Amazon S3. It's part of the AWS CLI toolkit and feels intuitive, especially for anyone familiar with Unix-like sync or recursive copy patterns. The command does exactly what it promises: it syncs a local directory with an S3 bucket (or vice versa), copying new and updated files and skipping the rest.
Here’s a simple example:
bash
aws s3 sync ./data s3://my-bucket/data
This command syncs the contents of the local ./data directory to the specified bucket. You can also use options to control behavior, like --delete to remove files in the destination that no longer exist locally, or --exclude to skip specific files. In theory, this gives you a lightweight, reproducible way to manage object storage.
For small projects, creating backups, or one-off transfers, aws s3 sync gets the job done. It supports flags like --exact-timestamps and --acl bucket-owner-full-control, and it lets you override the command’s default endpoint URL or configure output formats via the cli_binary_format setting. If your workflow only deals with a handful of files and minimal metadata, syncing may feel like the simplest solution.
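Here’s what a fuller invocation tends to look like once a few of those options come into play (an illustrative sketch only; the bucket name, paths, and patterns are placeholders):
bash
aws s3 sync ./data s3://my-bucket/data \
  --exclude "*.tmp" \
  --delete \
  --exact-timestamps \
  --acl bucket-owner-full-control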
But as we’ll see in the next section, simplicity comes at a cost.
But Syncing Has Serious Limitations
At first glance, aws s3 sync seems like a powerful tool, but the deeper your data infrastructure goes, the more cracks start to show. What works for simple file uploads or basic backups doesn’t scale well when you’re dealing with constantly changing datasets, large directories, or data-intensive workflows.
Performance Bottlenecks for Large Datasets
Every time you run the sync command, the AWS CLI compares your local directory to the contents of the target bucket. This works fine for a dozen files, but for thousands (or millions) the overhead compounds quickly. Since the sync operation is batch-based, there’s no concept of continuous access or low-latency interaction. It’s always check, then copy.
And because the sync command decides what to copy by comparing file metadata, edge cases involving object metadata, server-side encryption, or default metadata settings can introduce inconsistencies or false positives.
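One partial mitigation, if you stay on sync, is to spell these settings out on every run instead of relying on defaults; a minimal sketch (the encryption choice here is just an example):
bash
aws s3 sync ./data s3://my-bucket/data \
  --sse AES256 \
  --exact-timestamps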
“Syncing can quietly become a performance tax—especially when it starts competing with compute jobs for bandwidth or time.”
— Archil Engineering Lead, Harrison Leath
Operational Complexity Grows Quickly
The moment your project outgrows its first bucket, you’re juggling flags. You’re specifying file types to exclude (--exclude '*.tmp'), adding ACLs (--acl bucket-owner-full-control), managing delete behavior (--delete), and ensuring the cli_binary_format setting is properly configured. That’s before you even get into global options like --no-paginate or --no-cli-pager, which are often required for automation.
In larger teams, someone inevitably runs a sync that deletes the wrong files or fails silently, especially when using tools that don’t surface the command output clearly. The result: files vanish, backups overwrite production data, or permissions are mishandled.
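In practice, teams end up pinning several of these options just to get predictable, quiet output in automation; an illustrative CI-style invocation (the bucket and paths are placeholders):
bash
# Terse, non-interactive output for CI logs
aws s3 sync ./artifacts s3://my-bucket/artifacts \
  --only-show-errors \
  --no-progress \
  --no-cli-pager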
Risk of Data Loss or Duplication
One of the most dangerous pitfalls of sync is the --delete flag. Used improperly, it can remove files in the destination that don’t exist in the source, even if those files were placed there intentionally by another service or team. A --dryrun pass can preview what a run would do, but it’s easy to skip, and beyond that you’re left scripting your own guardrails around flags like --exact-timestamps and other CLI input parameters.
And since aws s3 sync doesn’t offer transactional guarantees, sync interruptions (due to timeouts, SSL verification errors, or socket connect timeouts) can leave your storage layer in a half-complete state.
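A common guardrail is to preview a destructive sync before running it for real; a small sketch of that habit (paths and bucket are placeholders):
bash
# Preview what would be copied or deleted, without touching anything
aws s3 sync ./data s3://my-bucket/data --delete --dryrun

# If the plan looks right, run it for real
aws s3 sync ./data s3://my-bucket/data --delete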
Common Sync Pain Points Devs Deal With
Beyond performance and reliability, aws s3 sync introduces a surprising amount of low-level friction, especially once your usage goes beyond basic file uploads. These aren’t edge cases. They’re the daily reality for developers managing data pipelines and multi-environment deployments.
Here are some common frustrations that come up again and again:
- Lack of visibility before execution: Unless you write wrapper scripts or simulate actions with careful filters, it’s hard to preview what a sync operation will do. The sync command moves objects in bulk, often with minimal feedback in the command output.
- Fragile filtering and exclusions: Syncing only the files you need can require a patchwork of --exclude, --include, and --exact-timestamps (see the example after this list). Miss one, and you might accidentally upload files that shouldn’t be there, or delete ones that should.
- Inconsistent metadata handling: Sync doesn’t always preserve object metadata, caching behavior, or server-side encryption settings unless they’re explicitly specified with flags like --metadata-directive or --sse. This can lead to subtle bugs or downstream failures.
- Error-prone CLI ergonomics: Misconfigured defaults like the cli_binary_format setting, or automatic behaviors like the CLI pager and auto-prompt, can trip up automation scripts. Many engineers run into vague errors or overly verbose logs unless they fine-tune flags like --only-show-errors.
- Limited symbolic link support: If your local file structure uses symlinks, a common pattern in ML projects, aws s3 sync dereferences them rather than preserving them as links, since S3 has no native symlink concept. This can silently break expectations about file presence or path dependencies.
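The filtering patchwork is trickier than it looks because filters are evaluated in order, with later rules taking precedence; an illustrative example of the classic idiom (paths are placeholders):
bash
# Upload only CSV files: exclude everything, then re-include the pattern
aws s3 sync ./reports s3://my-bucket/reports --exclude "*" --include "*.csv"

# Reverse the order and the trailing exclude wins, so nothing is uploaded
aws s3 sync ./reports s3://my-bucket/reports --include "*.csv" --exclude "*"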
If your sync operation involves repeated overrides or excessive debug logging, you're probably using the wrong abstraction for the job.
When Syncing Makes Sense
Despite its drawbacks, aws s3 sync still has a place in modern cloud workflows, especially when simplicity and portability matter more than speed or scale. For some use cases, syncing is exactly the right tool.
Here’s where the sync command shines:
- One-time file transfers: Migrating static assets, such as .txt files or archived logs, from a local directory to a bucket.
- Basic backups: Creating point-in-time copies of the current local directory, especially in CI pipelines or developer environments.
- Cold storage pushes: Sending infrequently accessed files to S3 storage tiers using the AWS Command Line Interface, without needing real-time access.
- Air-gapped workflows: When the source and target systems can’t remain continuously connected, sync provides a batch transfer model that’s easy to reason about.
- Team-wide scripting: When you want a single, portable command that just works across machines with the AWS CLI installed.
If you’re running a small-scale workload, managing a few thousand files, or just need a way to push files matching certain patterns to a specified bucket, syncing may be all you need.
However, once you step into dynamic workflows, especially ones where you’re repeatedly syncing files that already exist, overriding metadata, or working with large datasets, the operational tax adds up quickly.
What Mounting Offers That Syncing Doesn’t
If syncing feels increasingly brittle as your workloads scale, it's because it was never meant to behave like a file system. It's a copy tool—nothing more. Mounting, on the other hand, fundamentally changes how you interact with cloud storage by treating S3 like a live, navigable, and responsive file system.
Here’s what mounting unlocks:
Real-Time Access to S3 Files
Mounted volumes let you access files on demand, without having to re-transfer, re-sync, or manage version state. No more sync-or-recursive-copy loops, and no scripting around which new and updated files changed. Files are simply... there, just like in a traditional local filesystem.
This approach eliminates sync drift entirely. If a file exists in the bucket, it’s accessible immediately from your current local directory, without triggering a background transfer job or reconciling timestamps.
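With a mount in place, ordinary shell commands are all you need; a hypothetical session (assuming the bucket is mounted at ~/data, as in the Archil example later in this article, and with made-up file names):
bash
ls ~/data/models/                 # browse objects like directories
cat ~/data/configs/prod.yaml      # read an object as if it were a local file
grep -r "batch_size" ~/data/configs/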
POSIX Compatibility
Archil’s mounted volumes behave like a standard Linux-compatible file system. That means:
- You can use standard Unix commands (cat, cp, ls) without worrying about sync flags
- Your code and pipelines don’t need to change
- You get proper file semantics, including support for symbolic links, permission flags, and predictable behavior with large file trees
This is especially powerful for teams working with legacy tools or batch-processing systems that assume files are on-disk—not in a remote object store.
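For example, a batch job that assumes its inputs are on local disk can be pointed at the mount directly; a hypothetical sketch (the script, paths, and file types are placeholders):
bash
# Hypothetical training job that expects an on-disk directory
python train.py --input-dir ~/data/training-set

# Standard file tooling works against the same mount
find ~/data -name "*.parquet" | head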
Zero Data Movement
Unlike sync, which always copies data between endpoints, a mounted volume streams files directly from S3. This means:
- No duplicate storage usage from keeping copies of files in both source and destination
- No re-uploading the same files because of minor metadata changes
- No confusion around delete behavior, sync state, or whether the last command used the correct flags
The mount behaves like a window into the bucket, not a mirror of it.
“Once you stop thinking in terms of file transfer and start thinking in terms of file presence, everything gets faster and simpler.”
— Archil Founder, Hunter Leath
Archil in Action: Mounting S3 the Right Way
Archil replaces the repetitive, error-prone nature of aws s3 sync with a single, declarative step: mount your S3 bucket as if it were a local file system. No data gets copied unless it’s accessed. No scripting gymnastics. No sync loops.
Here’s what it looks like in practice:
bash
archil mount s3://my-bucket ~/data
This command creates a live, POSIX-compliant view of your bucket at ~/data. Files appear in your local directory structure instantly. There’s no sync command constantly rechecking for new and updated files, no concern about the cli_binary_format setting, and no risk of conflicting states caused by multiple tools trying to “fix” the same folder.
Archil also respects S3’s native configurations, whether you’re using access point aliases, server-side encryption, or bucket-owner-full-control ACLs. Under the hood, it uses a custom caching and streaming protocol to deliver sub-second access to S3 objects, even in large-scale environments.
And because Archil is built for Unix-style environments, it works seamlessly with common shell tools, data pipelines, and AI/ML workflows, without requiring you to rewire everything just to avoid syncing.
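In practice, the change usually amounts to replacing a sync step with a mount step; a hedged before-and-after sketch (the bucket, paths, and processing script are placeholders, and the mount syntax follows the example above):
bash
# Before: copy everything down, then work on the copy
aws s3 sync s3://my-bucket/data ./data
./process.sh ./data

# After: mount once and work against the live view
archil mount s3://my-bucket ~/data
./process.sh ~/data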
Better Performance, Lower Costs
Every time you run aws s3 sync, you’re not just moving data; you’re paying for it. Syncing large directories leads to increased storage usage, more API calls, and duplicated data transfer, especially when working with S3 across multiple environments. Even small mistakes, like resyncing files due to changed object metadata or forgotten flags, can rack up real costs.
Archil avoids this entirely. By mounting the bucket as a live file system, it fetches only what you need, when you need it, without copying files unnecessarily. There’s no storage duplication, no wasted transfer bandwidth, and no extra compute spent scanning the current local directory for differences.
No More Paying to Re-Sync the Same Files
The savings go beyond just dollars:
- You avoid delays caused by excessive sync runs and S3 rate limits
- You don’t need to script around socket connect timeouts or retry logic
- You reduce the risk of incurring charges from re-uploading the same files, or of deleting files with the wrong flag
“One of our customers reduced their monthly S3 I/O costs by 78% just by swapping out their daily sync jobs for a single Archil mount.”
— Archil Founder, Hunter Leath
For teams working with large models, real-time analytics, or high-churn datasets, sync becomes an unnecessary tax. Mounting replaces all of that with something faster—and cheaper.
When to Still Use aws s3 sync
Despite its limitations, aws s3 sync isn’t going away anytime soon, and it still makes sense in a few specific scenarios.
The Right Tool for the Right Moment
Here’s when sync might be the better option:
- One-time data pushes from a source directory during setup or migration
- Backup jobs that run overnight and require only simple diffs of uploaded files
- Air-gapped workflows where mounting isn’t possible or supported
- Static archiving of folders using --delete, exclusion filters, or simple inclusion patterns
If your use case doesn’t require low latency or live access, and if you’re comfortable parsing the AWS CLI user guide and fine-tuning settings like cli_binary_format, sync can still serve you well.
In these cases, the following sync command might still be all you need:
bash
aws s3 sync ./backup s3://my-archive-bucket --delete --exclude "*.tmp"
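Because that command includes --delete, it’s worth previewing the run before letting it loose; the same command with --dryrun shows what would change without touching anything:
bash
aws s3 sync ./backup s3://my-archive-bucket --delete --exclude "*.tmp" --dryrun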
But It Doesn’t Scale
The more often you run syncs, the more you’re exposed to fragility: inconsistent metadata handling, unpredictable file states, and CLI flags that break silently when a default value is misapplied. Sync wasn’t built to deliver fast iteration, real-time access, or high availability.
For teams dealing with fast-moving data or pipelines that need to function properly at scale, mounting simply removes the guesswork.
Final Thoughts: Don’t Sync When You Can Mount
The aws s3 sync command served its purpose: it gave developers a simple way to push files to S3 from a local machine. But cloud workloads have evolved. Teams now need real-time file access, predictable performance, and fewer brittle scripts. Mounting S3 directly offers a better foundation.
With Archil, you avoid the complexity, fragility, and cost of traditional sync-based workflows. Here’s what you no longer have to worry about:
- Overriding the command’s default endpoint URL or tweaking the aws configure command
- Troubleshooting unclear file transfer progress or broken automation flows
- Managing default output formats, credential files, access key IDs, and secret access keys across environments
- Handling permissions and ownership via canonical account IDs and granted permissions
- Remembering to verify SSL certificates or disable automatic pagination in CI/CD pipelines
- Accidentally breaking encryption settings like server-side encryption or customer-provided encryption keys
- Losing or duplicating metadata from a misused metadata directive or caching-behavior setting
- Breaking pipelines due to mishandled symbolic links or Unix-style quotation rules
- Digging through the AWS Console to debug a silent failure in your sync job
“We built Archil so you can interact with S3 like it’s part of your machine—not a remote object you have to constantly chase down.”
— Archil Founder, Hunter Leath
The takeaway: if your infrastructure depends on syncing, it's probably time to switch to something built for scale, speed, and sanity.
Try It Yourself
Ready to stop syncing and start mounting?
Archil lets you turn any S3 bucket into a high-performance, POSIX-compliant file system, without changing your application code, cloud environment, or folder structure.
- Get instant access to your S3 data with a single mount command
- Reduce AWS costs by eliminating unnecessary sync jobs
- Streamline dev, ML, and analytics workflows with real-time, on-demand access
👉 **Get started with Archil now** and make your data infrastructure simpler, faster, and future-proof.