If you have a 10 Gigabit-per-second (Gbps) network link, but your S3 uploads max out at a fraction of that speed, here is the harsh truth: S3 is not the bottleneck. The service is engineered for massive scale. The real limits are almost always on your side of the connection. Here’s a deep dive into the four critical factors determining your S3 upload speed and how to tune your environment to achieve near line-rate speeds (up to 9.4 Gbps).
Contents
- The Myth: S3 Can’t Sustain High Speeds
- Master Parallel Multipart Uploads
- The Hardware Trap: Beware of Firewall Limitations
- Client-Side Tuning Matters at Scale
The Myth: S3 Can’t Sustain High Speeds
A common misconception is that S3 itself limits the throughput of a single stream. The reality is that with a proper configuration, S3 can easily sustain near line-rate speeds. We’ve seen setups push data at rates of up to 9.4 Gbps over a dedicated 10 Gbps AWS Direct Connect link.
This performance, however, is not achieved by accident. It requires leveraging S3’s core architecture.
Master Parallel Multipart Uploads
The single biggest reason high-speed uploads underperform is relying on a single upload stream.
A single TCP connection—even on a 10 Gbps pipe—simply won’t saturate your available bandwidth. The key to full utilization lies in parallelization.
| Strategy | Description | Impact on Throughput |
| --- | --- | --- |
| Break Large Files into Chunks | Use the S3 Multipart Upload API to divide large files (e.g., >100 MB) into smaller parts. | Essential for enabling parallelization of a single file. |
| Run Concurrent Streams | Simultaneously initiate multiple multipart uploads and part transfers. | Critical to utilizing high-bandwidth links. We recommend starting with 16 or more concurrent streams to fully saturate a 10 Gbps connection. |
Key Recommendation: Always use tools like the AWS CLI, SDKs, or third-party transfer utilities that are specifically designed to perform parallel multipart uploads.
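As a concrete illustration, here is a minimal sketch using the boto3 SDK’s managed transfer layer; the bucket name, object key, and file path are placeholders, and the 100 MB/16-stream figures simply mirror the guidance above. TransferConfig controls both the multipart split and the number of concurrent part uploads:

```python
# Minimal sketch: parallel multipart upload via boto3's managed transfer layer.
import boto3
from boto3.s3.transfer import TransferConfig

# Split anything larger than 100 MB into 100 MB parts and upload
# up to 16 parts concurrently over separate connections.
config = TransferConfig(
    multipart_threshold=100 * 1024 * 1024,  # switch to multipart above 100 MB
    multipart_chunksize=100 * 1024 * 1024,  # 100 MB parts
    max_concurrency=16,                     # 16 parallel part uploads
    use_threads=True,
)

s3 = boto3.client("s3")
s3.upload_file(
    Filename="backup.tar",         # local file (placeholder)
    Bucket="my-upload-bucket",     # target bucket (placeholder)
    Key="backups/backup.tar",      # object key (placeholder)
    Config=config,
)
```

The AWS CLI exposes equivalent settings (multipart_threshold, multipart_chunksize, max_concurrent_requests) in the s3 section of its configuration, so the same tuning applies when uploading with aws s3 cp.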
The Hardware Trap: Beware of Firewall Limitations
You bought a “10-gig” firewall, so you should get 10 Gbps of throughput, right? Not necessarily.
Many enterprise-grade firewalls and network devices are designed to inspect and manage a high number of smaller concurrent streams, but cannot handle the sustained packet volume of a single 10 Gbps flow. The cause is usually per-flow limits in the packet-processing pipeline or the overhead of deep packet inspection.
True high-throughput data transfer requires a clean network path with hardware that is specifically aligned to handle your maximum network speed without becoming the bottleneck itself. Consult your network vendor to confirm the actual sustained throughput limits for large, continuous data flows.
Client-Side Tuning Matters at Scale
For data transfers at large scale (terabytes of data or millions of files), the bottleneck often shifts back to the machine initiating the transfer—your client-side system.
Achieving optimal throughput requires extensive system optimization:
- Storage Performance: Your local storage (HDD/SSD array) must be fast enough to read data off the disk at the same rate the network is sending it (roughly 1.25 GB/s for a saturated 10 Gbps link).
- Kernel Tuning: The operating system’s network stack settings (e.g., TCP window sizes, socket buffer limits) need to be raised to accommodate high-latency, high-bandwidth transfers with a large bandwidth-delay product (a Linux-oriented sketch follows this list).
- Dedicated Data Transfer Nodes (DTNs): Enterprises often dedicate powerful servers or virtual machines solely for data transfer. These DTNs are heavily optimized for I/O and networking to eliminate all local bottlenecks.
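As a rough illustration of the kernel-tuning point, the sketch below is Linux-specific and assumes values not taken from this article (the /proc/sys paths and the 16 MB buffer size are examples only). It reads the kernel’s current socket buffer limits and shows how a client can request a larger send buffer for a single connection:

```python
# Sketch: inspect kernel TCP buffer limits and request a larger per-socket
# send buffer before opening upload connections (Linux only).
import socket


def read_sysctl(path: str) -> str:
    """Read a kernel setting from /proc/sys (Linux only)."""
    with open(path) as f:
        return f.read().strip()


# Maximum socket buffer sizes the kernel will grant (bytes).
print("net.core.wmem_max  =", read_sysctl("/proc/sys/net/core/wmem_max"))
print("net.core.rmem_max  =", read_sysctl("/proc/sys/net/core/rmem_max"))
print("net.ipv4.tcp_wmem  =", read_sysctl("/proc/sys/net/ipv4/tcp_wmem"))

# Request a 16 MB send buffer; the kernel silently caps this at wmem_max,
# so raising the sysctl limits is usually required as well.
sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
sock.setsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF, 16 * 1024 * 1024)
# Linux reports roughly double the requested value to account for
# internal bookkeeping overhead.
print("Send buffer granted:", sock.getsockopt(socket.SOL_SOCKET, socket.SO_SNDBUF), "bytes")
sock.close()
```

System-wide limits are raised with sysctl (net.core.wmem_max, net.ipv4.tcp_wmem, and their receive-side counterparts) and should be sized to your link’s bandwidth-delay product.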
By tuning your client-side system, you ensure that data is ready to be fed to the concurrent S3 streams as quickly as your network can handle it. By implementing these strategies, you can ensure your cloud data migration is limited only by the laws of physics, not a simple configuration oversight.