There is more to low-latency communication than just squirting data as fast as you can (though that is an important part of it).
One thing people often forget is that the Arduino can only (easily) do one thing at a time. While you're sending your buffer to the PC it can't be sampling and filling that buffer. If you have a buffer which fills up over a period of 1 second and you want to send it to the PC when full, what happens during the subsequent second, while you're sending it? The buffer can't be filled while it's being sent.
Using a normal UART-to-USB connection, as most of the normal Arduino boards (Uno, Mega, etc.) do, the fastest you can reasonably expect is around 1 Mbaud. With one start bit and one stop bit around every byte, that's about 100 kB/s. Less if you use a cheap clone with a CH340G. Even at 100 kB/s your 1 kB buffer, if sent completely raw and in binary, would take around 10 ms to transfer. But getting the computer to make any sense of that raw data would be hard, so you'd need to wrap it in some form of protocol (up to you to design that...), and you may only get half that speed (or worse, depending on your protocol design).
And during that time not much else can happen.
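To put numbers on that, here is a small sketch of the arithmetic. It assumes 8N1 framing (10 bits on the wire per byte) and ignores any protocol overhead; the function names are just made up for illustration.

```cpp
#include <stdint.h>

// Assumption: 8N1 framing, i.e. 1 start bit + 8 data bits + 1 stop bit.
constexpr uint32_t kBitsPerByte8N1 = 10;

// Raw payload throughput of a UART in bytes per second.
constexpr uint32_t uartBytesPerSecond(uint32_t baud) {
    return baud / kBitsPerByte8N1;
}

// Time in microseconds to send `bytes` back to back at `baud`,
// rounded down. 64-bit intermediate avoids overflow.
constexpr uint32_t transferMicros(uint32_t bytes, uint32_t baud) {
    return static_cast<uint32_t>(
        (static_cast<uint64_t>(bytes) * kBitsPerByte8N1 * 1000000ULL) / baud);
}
```

So `transferMicros(1000, 1000000)` works out to 10000 µs, the 10 ms figure above, while the same 1000 bytes at 115200 baud would take about 87 ms.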
For smooth low-latency communication of data it is far better to use "ping-pong" buffers and interrupt- or DMA-driven sampling of your data. That means that one buffer is receiving your data from whatever source while the other is being sent to the PC. Once the first buffer has filled up they swap roles. However, the little 8-bit Arduinos aren't really suited to that kind of thing:
- They don't have much RAM to waste on two buffers
- They don't have configurable interrupt priorities to let sampling pre-empt sending
- They don't have DMA to do background transfers of data to/from peripherals
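For what it's worth, the ping-pong idea itself can be sketched in plain C++ like this. The names (`PingPong`, `push`, `readyBuffer`, `release`) are made up for illustration and don't come from any Arduino library. On a real board `push` would be called from a timer or ADC interrupt, and the shared state would need to be `volatile` and protected against concurrent access, which this sketch leaves out.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Two fixed-size buffers: the producer fills one while the consumer
// sends the other. When the fill buffer is full they swap roles.
template <size_t N>
class PingPong {
public:
    // Producer side: store one sample. Returns false (sample dropped)
    // if both buffers are full, i.e. the consumer is too slow.
    bool push(uint16_t sample) {
        if (count_ == N) return false;
        buf_[fill_][count_++] = sample;
        if (count_ == N && !ready_) swap();
        return true;
    }

    // Consumer side: pointer to a full buffer of N samples,
    // or nullptr if nothing is ready to send yet.
    const uint16_t* readyBuffer() const {
        if (!ready_) return nullptr;
        return buf_[fill_ ^ 1];
    }

    // Consumer side: call once the ready buffer has been sent.
    void release() {
        assert(ready_);
        ready_ = false;
        if (count_ == N) swap();  // the other buffer filled up meanwhile
    }

private:
    // Hand the full buffer to the consumer and start filling the other.
    void swap() {
        fill_ ^= 1;
        count_ = 0;
        ready_ = true;
    }

    uint16_t buf_[2][N];
    size_t count_ = 0;    // samples in the buffer being filled
    uint8_t fill_ = 0;    // index of the buffer being filled
    bool ready_ = false;  // true while the other buffer awaits sending
};
```

The main loop would poll `readyBuffer()`, send the N samples it points at, then call `release()`. If `push` ever returns false, sending is not keeping up with sampling.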
If latency is really that critical you would be better off using one of the ARM-based Arduinos, such as the Due, where you can also take advantage of a high-speed native USB interface, which can give much higher throughput than a UART connection. Your choice of USB endpoint can also influence the latency of your design:
- Isochronous - Lowest latency, small packets, not guaranteed to be delivered
- Interrupt - Medium latency, small packets, delivery guaranteed
- Bulk - High latency, large packets, delivery guaranteed
Isochronous is used mainly for audio, where you would notice the slightest delay in sound but wouldn't necessarily notice or worry about a slight glitch in the quality. Interrupt is used for things like HID, where you need it to respond to you pressing the keys on your keyboard nice and fast but you're never sending much data (64-byte packet limit on full-speed interrupt transfers). Bulk is used for things like CDC/ACM (Serial over USB), where you may want to transfer lots of data but wouldn't notice if it pauses for a few ms every now and then (high-speed (HS) USB has a 512-byte packet limit on bulk transfers, full-speed (FS) USB has a 64-byte limit).
One of the great things with a direct USB connection is that underneath it is all packet-based, so you can use that as your transfer protocol if you know what you're doing. Don't just rely on the CDC/ACM "pipe", but instead make use of the underlying USB protocol to send individual packets which the receiving program can then understand and break apart into the right data without adding any extra protocol overheads.
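As a rough illustration of treating the USB packet itself as the frame, here is a sketch that packs a sequence number and a batch of samples into one 64-byte packet. The layout (2-byte little-endian sequence number followed by 31 little-endian 16-bit samples) is entirely hypothetical, as are the function names, and the code for actually handing the packet to the USB stack is left out because it depends on which core you use.

```cpp
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

// Assumption: a 64-byte endpoint, the full-speed bulk packet limit.
constexpr size_t kPacketSize = 64;
// 2 bytes of sequence number, the rest is 16-bit samples.
constexpr size_t kSamplesPerPacket = (kPacketSize - 2) / 2;

// Sender side: fill `out` (kPacketSize bytes) from kSamplesPerPacket samples.
void packSamples(uint16_t seq, const uint16_t* samples, uint8_t* out) {
    assert(samples != nullptr && out != nullptr);
    out[0] = static_cast<uint8_t>(seq & 0xFF);
    out[1] = static_cast<uint8_t>(seq >> 8);
    for (size_t i = 0; i < kSamplesPerPacket; i++) {
        out[2 + 2 * i] = static_cast<uint8_t>(samples[i] & 0xFF);
        out[3 + 2 * i] = static_cast<uint8_t>(samples[i] >> 8);
    }
}

// Receiver side: recover the samples and return the sequence number.
uint16_t unpackSamples(const uint8_t* in, uint16_t* samples) {
    assert(in != nullptr && samples != nullptr);
    for (size_t i = 0; i < kSamplesPerPacket; i++) {
        samples[i] = static_cast<uint16_t>(in[2 + 2 * i] | (in[3 + 2 * i] << 8));
    }
    return static_cast<uint16_t>(in[0] | (in[1] << 8));
}
```

Because every packet starts at a known offset there is no need for start markers or byte stuffing, and a gap in the sequence numbers tells the receiving program that a packet went missing.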
For more advanced USB programming the Teensy and chipKIT cores have far more sophisticated USB support than the Arduino core, so it may be worth taking a look at their boards, which may suit your needs better.