Re: [PERFORM] Question about network bandwidth usage between PostgreSQL’s client and server

Lists: pgsql-performance
From: Kelphet Xiong <kelphet(at)gmail(dot)com>
To: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Question about network bandwidth usage between PostgreSQL’s client and server
Date: 2013-04-24 23:56:24
Message-ID: CANnt-4Zrm=V1S5ViYnieUr1GVXROZ_BJsRAjcqKJomSEjq7zpA@mail.gmail.com

Hi all,

I have a general question about the network traffic between a PostgreSQL
client and server: what determines the network bandwidth usage, i.e. the
data transfer rate, between a client and a server when the available
bandwidth is more than sufficient?

For example, I ran queries on two tables, lineitem and partsupp, from the
TPC-H benchmark (scale factor 5). The lineitem table is 4630 MB and has
30,000,000 rows; the partsupp table is 693 MB and has 4,000,000 rows.
Their definitions are shown below:

              Table "public.lineitem"
     Column      |         Type          | Modifiers
-----------------+-----------------------+-----------
 l_orderkey      | integer               | not null
 l_partkey       | integer               | not null
 l_suppkey       | integer               | not null
 l_linenumber    | integer               | not null
 l_quantity      | numeric(15,2)         | not null
 l_extendedprice | numeric(15,2)         | not null
 l_discount      | numeric(15,2)         | not null
 l_tax           | numeric(15,2)         | not null
 l_returnflag    | character(1)          | not null
 l_linestatus    | character(1)          | not null
 l_shipdate      | date                  | not null
 l_commitdate    | date                  | not null
 l_receiptdate   | date                  | not null
 l_shipinstruct  | character(25)         | not null
 l_shipmode      | character(10)         | not null
 l_comment       | character varying(44) | not null

           Table "public.partsupp"
    Column     |          Type          | Modifiers
---------------+------------------------+-----------
 ps_partkey    | integer                | not null
 ps_suppkey    | integer                | not null
 ps_availqty   | integer                | not null
 ps_supplycost | numeric(15,2)          | not null
 ps_comment    | character varying(199) | not null

For different queries, I observe different data transfer rates between the
client and the server (which run on different physical machines), measured
using tcpdump, as shown below:

Query 1: select * from lineitem;

  Seq Scan on lineitem  (cost=0.00..892562.86 rows=29998686 width=125)

Average network usage: 42.5 MB/s

Query 2: select * from partsupp;

  Seq Scan on partsupp  (cost=0.00..128685.81 rows=4001181 width=146)

Average network usage: 95.9 MB/s

Query 3: select * from lineitem, partsupp where l_partkey=ps_partkey;

  Hash Join  (cost=178700.57..17307550.15 rows=116194700 width=271)
    Hash Cond: (lineitem.l_partkey = partsupp.ps_partkey)
    ->  Seq Scan on lineitem  (cost=0.00..892562.86 rows=29998686 width=125)
    ->  Hash  (cost=128685.81..128685.81 rows=4001181 width=146)
          ->  Seq Scan on partsupp  (cost=0.00..128685.81 rows=4001181 width=146)

Average network usage: 53.1 MB/s

In all the experiments, the lineitem and partsupp tables reside in memory,
since iotop shows no I/O activity. Given that there is plenty of network
bandwidth (1 Gb/s, i.e. 128 MB/s) between the client and the server, I would
like to know what determines the data transfer rate, i.e. the network
bandwidth usage, in this situation. For example, given that each lineitem
tuple is about 88% the size of a partsupp tuple, why is the average network
usage for a sequential scan of lineitem only about 50% of that for partsupp?
And why is the average network usage of their join higher than that of the
sequential scan of lineitem but lower than that of the sequential scan
of partsupp?

Thanks!

Kelphet Xiong


From: Heikki Linnakangas <hlinnakangas(at)vmware(dot)com>
To: Kelphet Xiong <kelphet(at)gmail(dot)com>
Cc: "pgsql-performance(at)postgresql(dot)org" <pgsql-performance(at)postgresql(dot)org>
Subject: Re: [PERFORM] Question about network bandwidth usage between PostgreSQL’s client and server
Date: 2013-04-25 11:26:00
Message-ID: 517912C8.2010801@vmware.com

On 25.04.2013 02:56, Kelphet Xiong wrote:
> In all the experiments, the lineitem and partsupp tables reside in memory
> because there is no io activities observed from iotop.
> Since there is enough network bandwidth (1Gb/s or 128MB/s) between
> client and server,
> I would like to know what determines the data transferring rate or the
> network bandwidth usage
> between a client and a server when network bandwidth is enough.

Since there's enough network bandwidth available, the bottleneck is
elsewhere. I don't know what it is in your example - maybe it's the I/O
capacity, or CPU required to process the result in the server before
it's sent over the network. It could also be in the client, on how fast
it can process the results coming from the server.

I'd suggest running 'top' on the server while the query is executed, and
keeping an eye on the CPU usage. If it's pegged at 100%, the bottleneck
is the server's CPU.

> For example, given that the size of each tuple of lineitem table is
> 88% of that of partsupp,
> why the average network usage for sequential scan of lineitem table is only 50%
> that of partsupp table? And why the average network usage of their
> join is higher
> than that of sequential scan of lineitem but lower than that of
> sequential scan of partsupp table?

Here's a wild guess: the query on lineitem is bottlenecked by CPU usage
in the server. A lot of CPU could be spent on converting the date fields
from on-disk format to the text representation that's sent over the
network; I've seen that conversion use up a lot of CPU time on some test
workloads. Try leaving out the date columns from the query to test that
theory.

If that's the bottleneck, you could try fetching the result in binary
format; that should consume less CPU in the server. You didn't mention
which client library you're using, but e.g. with libpq, see the manual on
PQexecParams for how to set the result format.
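With libpq, the last argument of PQexecParams selects the result format
(a minimal sketch; the connection string, query, and error handling are
placeholders, not part of the original mail):

```c
#include <stdio.h>
#include <libpq-fe.h>

int main(void)
{
    /* Placeholder connection string -- adjust host/dbname for your setup */
    PGconn *conn = PQconnectdb("host=server dbname=tpch");
    if (PQstatus(conn) != CONNECTION_OK) {
        fprintf(stderr, "connection failed: %s", PQerrorMessage(conn));
        return 1;
    }

    /* resultFormat = 1 requests binary results, so the server skips the
     * datum-to-text conversion of dates/numerics discussed above. */
    PGresult *res = PQexecParams(conn,
                                 "SELECT l_orderkey, l_shipdate FROM lineitem",
                                 0,        /* nParams */
                                 NULL,     /* paramTypes */
                                 NULL,     /* paramValues */
                                 NULL,     /* paramLengths */
                                 NULL,     /* paramFormats */
                                 1);       /* resultFormat: binary */
    if (PQresultStatus(res) == PGRES_TUPLES_OK)
        printf("%d rows fetched in binary format\n", PQntuples(res));

    PQclear(res);
    PQfinish(conn);
    return 0;
}
```

Note that with binary results the client then has to decode the wire
representations itself (e.g. dates as days since 2000-01-01), so some of
the conversion cost moves to the client rather than disappearing.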

- Heikki