p_i_r_a_n_y_a: (Default)
[personal profile] p_i_r_a_n_y_a
 This autumn I have caught a cold,
Only no doctor can help me:
I am sick with you, and I need you…
I can't go on without you, it's enough to make me cry…


Yashchik, St. Petersburg

Oct. 20th, 2017 09:04 pm
olga_arefieva: (Default)
[personal profile] olga_arefieva
Piter won't let go of me. Something is going on inside me there that makes me want more and more. Creativity there practically breaks the door down and knocks at my head.

YASHCHIK Club.
"Olga Arefieva will give a solo acoustic concert at the YASHCHIK club in St. Petersburg. The intimate setting of this creative evening promises close contact, honest answers to questions, smiles and jokes, premieres of new works, and rare songs. And the cozy space of the YASHCHIK club is, perhaps, one of the most fitting places for such a meeting.
Author and audience meet face to face in a small space. Olga first tried performing solo several years ago and, it should be noted, quite successfully. Now we can say with confidence that the format of solo concerts and creative evenings is entirely viable!"
Starts at 20:00 | YASHCHIK Club | Ligovsky Pr. 50, bldg. 13 | 964-96-37
Tickets from 800 rubles at the club box office and via the link.

[syndicated profile] bleeping_computer_feed

Posted by Lawrence Abrams

This week we had our fair share of smaller variants being distributed or created, but the big news was by far the release of Magniber and the use of the Hermes ransomware as a cover to steal money from a Taiwanese bank. With the release of Magniber we also see the downward spiral of Cerber. [...]

Train Tickets.

Oct. 20th, 2017 10:20 pm
olindom: (Default)
[personal profile] olindom
 3 Greeks and 3 Turks are travelling by train to a conference. At the station, the 3 Turks each buy tickets and watch as the 3 Greeks buy only a single ticket. "How are 3 people going to travel on only one ticket?" asks one Turk. "Watch and you'll see," answers one Greek. They all board the train. The Turks take their respective seats, but all three Greeks cram into a restroom and close the door behind them. Shortly after the train has departed, the conductor comes around collecting tickets. He knocks on the restroom door and says, "Ticket, please." The door opens just a crack and a single arm emerges with a ticket in hand. The conductor takes it and moves on. The Turks saw this and agreed it was quite a clever idea. So after the conference, the Turks decide to copy the Greeks on the return trip and save some money (being clever with money, and all that).
When they get to the station, they buy a single ticket for the return trip. To their astonishment, the Greeks don't buy a ticket at all. "How are you going to travel without a ticket?" says one perplexed Turk. "Watch and you'll see," answers a Greek. When they board the train, the 3 Turks cram into a restroom and the 3 Greeks cram into another one nearby. The train departs. Shortly afterward, one of the Greeks leaves his restroom and walks over to the restroom where the Turks are hiding. He knocks on the door and says, "Ticket, please."

source: http://www.jokes4us.com/winnersjokes/3greeksand3turksjoke.html

The Frightening Friday Five

Oct. 20th, 2017 02:13 pm
jesse_the_k: Perfectly circular white brain-like fungus growing on oak tree (Default)
[personal profile] jesse_the_k

What book frightened you as a young person?
None I can remember.

If you had to become a ‘living book’ (i.e. able to recite the contents of a book cover to cover upon request – reference Fahrenheit 451), what book would it be?
To Be of Use by Marge Piercy, poetry

What movie or TV show scared you as a kid?
The Outer Limits. I’d watch with my older sister and she told me when it was safe to lower my hands from my eyes.

What movie (scary or otherwise) will you never ever watch?
The Silence of the Lambs and its sequels

Do you have any phobias?
Centipedes, millipedes, and other myriapods make me recoil and squeal a little.

[syndicated profile] planetposgresql_feed

Under the covers, Postgres is, in simplified terms, one giant append-only log. When you insert a new record it gets appended, and the same happens for deletes and updates. On a delete the record is just flagged as invisible; it's not actually removed from disk immediately. On an update the old record is flagged as invisible and a new record is written. Later, Postgres comes through, looks at all the records that are invisible, and actually frees up the disk storage. This process is known as vacuum.
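This flag-then-reclaim behavior is easy to observe directly. A minimal sketch against a scratch database (the `mvcc_demo` table is hypothetical, and the statistics counters may lag a moment behind the statements):

```sql
-- Create some rows, then update and delete: the old row versions
-- become dead tuples rather than being removed from disk.
CREATE TABLE mvcc_demo (id int, val text);
INSERT INTO mvcc_demo SELECT g, 'x' FROM generate_series(1, 1000) g;
UPDATE mvcc_demo SET val = 'y';         -- 1000 old versions flagged invisible
DELETE FROM mvcc_demo WHERE id <= 500;  -- 500 more flagged invisible

-- The dead tuples show up in the statistics views:
SELECT n_live_tup, n_dead_tup
  FROM pg_stat_user_tables
 WHERE relname = 'mvcc_demo';

VACUUM mvcc_demo;  -- marks the dead-tuple space reusable
```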

There are a couple of key forms of VACUUM within Postgres:

  • VACUUM ANALYZE - Commonly run after you've loaded data into your database, so that Postgres updates its statistics about the data.
  • VACUUM FULL - Takes a lock for the duration of the operation, but scans the full table and reclaims all the space it can from dead tuples.
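For reference, the two forms look like this in practice (the table name is illustrative):

```sql
VACUUM ANALYZE sensor_log;  -- refreshes statistics; runs online, no exclusive lock
VACUUM FULL sensor_log;     -- rewrites the table; takes an ACCESS EXCLUSIVE lock
```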

Postgres also has a mechanism for regularly freeing up unused space, known as autovacuum, which we covered in an earlier post. We highly recommend tuning autovacuum appropriately to keep your bloat from ballooning. But a good question remains: do I have bloat in my Postgres database?
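Autovacuum can also be tuned per table via storage parameters. A sketch, with purely illustrative values rather than recommendations:

```sql
-- Vacuum this table once ~5% of its rows are dead,
-- instead of the default 20%:
ALTER TABLE events SET (autovacuum_vacuum_scale_factor = 0.05);

-- Check the current global default:
SHOW autovacuum_vacuum_scale_factor;
```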

This query is useful for periodically monitoring your bloat. How often you should check depends on your application, but every month or two is a reasonable cadence.

WITH constants AS (
    -- define some constants for sizes of things
    -- for reference down the query and easy maintenance
    SELECT current_setting('block_size')::numeric AS bs, 23 AS hdr, 8 AS ma
),
no_stats AS (
    -- screen out tables which have attributes
    -- that don't have stats, such as JSON
    SELECT table_schema, table_name, 
        n_live_tup::numeric as est_rows,
        pg_table_size(relid)::numeric as table_size
    FROM information_schema.columns
        JOIN pg_stat_user_tables as psut
           ON table_schema = psut.schemaname
           AND table_name = psut.relname
        LEFT OUTER JOIN pg_stats
        ON table_schema = pg_stats.schemaname
            AND table_name = pg_stats.tablename
            AND column_name = attname 
    WHERE attname IS NULL
        AND table_schema NOT IN ('pg_catalog', 'information_schema')
    GROUP BY table_schema, table_name, relid, n_live_tup
),
null_headers AS (
    -- calculate null header sizes
    -- omitting tables which don't have complete stats
    -- and attributes which aren't visible
    SELECT
        hdr+1+(sum(case when null_frac <> 0 THEN 1 else 0 END)/8) as nullhdr,
        SUM((1-null_frac)*avg_width) as datawidth,
        MAX(null_frac) as maxfracsum,
        schemaname, tablename,
        hdr, ma, bs
    FROM pg_stats CROSS JOIN constants
        LEFT OUTER JOIN no_stats
            ON schemaname = no_stats.table_schema
            AND tablename = no_stats.table_name
    WHERE schemaname NOT IN ('pg_catalog', 'information_schema')
        AND no_stats.table_name IS NULL
        AND EXISTS ( SELECT 1
            FROM information_schema.columns
                WHERE schemaname = columns.table_schema
                    AND tablename = columns.table_name )
    GROUP BY schemaname, tablename, hdr, ma, bs
),
data_headers AS (
    -- estimate header and row size
    SELECT
        ma, bs, hdr, schemaname, tablename,
        (datawidth+(hdr+ma-(case when hdr%ma=0 THEN ma ELSE hdr%ma END)))::numeric AS datahdr,
        (maxfracsum*(nullhdr+ma-(case when nullhdr%ma=0 THEN ma ELSE nullhdr%ma END))) AS nullhdr2
    FROM null_headers
),
table_estimates AS (
    -- make estimates of how large the table should be
    -- based on row and page size
    SELECT schemaname, tablename, bs,
        reltuples::numeric as est_rows, relpages * bs as table_bytes,
        CEIL((reltuples*
            (datahdr + nullhdr2 + 4 + ma -
                (CASE WHEN datahdr%ma=0
                    THEN ma ELSE datahdr%ma END)
                )/(bs-20))) * bs AS expected_bytes,
        reltoastrelid
    FROM data_headers
        JOIN pg_class ON tablename = relname
        JOIN pg_namespace ON relnamespace = pg_namespace.oid
            AND schemaname = nspname
    WHERE pg_class.relkind = 'r'
),
estimates_with_toast AS (
    -- add in estimated TOAST table sizes
    -- estimate based on 4 toast tuples per page because we don't have 
    -- anything better.  also append the no_stats tables
    SELECT schemaname, tablename, 
        TRUE as can_estimate,
        est_rows,
        table_bytes + ( coalesce(toast.relpages, 0) * bs ) as table_bytes,
        expected_bytes + ( ceil( coalesce(toast.reltuples, 0) / 4 ) * bs ) as expected_bytes
    FROM table_estimates LEFT OUTER JOIN pg_class as toast
        ON table_estimates.reltoastrelid = toast.oid
            AND toast.relkind = 't'
),
table_estimates_plus AS (
    -- add some extra metadata to the table data
    -- and calculations to be reused
    -- including whether we can't estimate it
    -- or whether we think it might be compressed
    SELECT current_database() as databasename,
            schemaname, tablename, can_estimate, 
            est_rows,
            CASE WHEN table_bytes > 0
                THEN table_bytes::NUMERIC
                ELSE NULL::NUMERIC END
                AS table_bytes,
            CASE WHEN expected_bytes > 0 
                THEN expected_bytes::NUMERIC
                ELSE NULL::NUMERIC END
                    AS expected_bytes,
            CASE WHEN expected_bytes > 0 AND table_bytes > 0
                AND expected_bytes <= table_bytes
                THEN (table_bytes - expected_bytes)::NUMERIC
                ELSE 0::NUMERIC END AS bloat_bytes
    FROM estimates_with_toast
    UNION ALL
    SELECT current_database() as databasename, 
        table_schema, table_name, FALSE, 
        est_rows, table_size,
        NULL::NUMERIC, NULL::NUMERIC
    FROM no_stats
),
bloat_data AS (
    -- do final math calculations and formatting
    select current_database() as databasename,
        schemaname, tablename, can_estimate, est_rows,
        table_bytes, round(table_bytes/(1024^2)::NUMERIC,3) as table_mb,
        expected_bytes, round(expected_bytes/(1024^2)::NUMERIC,3) as expected_mb,
        round(bloat_bytes*100/table_bytes) as pct_bloat,
        round(bloat_bytes/(1024::NUMERIC^2),2) as mb_bloat
    FROM table_estimates_plus
)
-- filter output for bloated tables
SELECT databasename, schemaname, tablename,
    can_estimate, est_rows,
    pct_bloat, mb_bloat, table_mb
FROM bloat_data
-- this where clause defines which tables actually appear
-- in the bloat chart
-- example below filters for tables which are either 50%
-- bloated and more than 20mb in size, or more than 25%
-- bloated and more than 1GB in size
WHERE ( pct_bloat >= 50 AND mb_bloat >= 20 )
    OR ( pct_bloat >= 25 AND mb_bloat >= 1000 )
ORDER BY pct_bloat DESC;

Credit to PGExperts for an updated version of the commonly used bloat query.

The above query will give you something that looks like this:

 databasename  | schemaname |   tablename   | can_estimate | est_rows | pct_bloat | mb_bloat | table_mb
---------------+------------+---------------+--------------+----------+-----------+----------+----------
 djsdcv27d8mqq | public     | github_events | t            |  1820360 |        29 |   983.65 | 3397.547
 djsdcv27d8mqq | public     | github_users  | t            |   952920 |        28 |   475.80 | 1710.852
(2 rows)

The amount of bloat you have is relative to the size of your database. If you're running a smaller database with only 20 GB of data in total, this amount of bloat may be noticeable and useful to clean up, but you may also be able to skate by for a while. If you're storing several hundred GB and have 100 GB or more of bloat, a significant percentage of your database, then it can be really worth cleaning up.

If you do find yourself with significant bloat you have a few options:

  • Consider a full vacuum (if your application can handle the lock while it runs)
  • Investigate your autovacuum settings for a healthier long-term setup
  • If it's a continual problem, consider sharding, which lets Postgres operate more concurrently over smaller tables at a time.
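If you do run a full vacuum, it's worth measuring the space you get back. A sketch using the github_events table from the sample output above:

```sql
SELECT pg_size_pretty(pg_total_relation_size('github_events'));  -- before
VACUUM FULL github_events;  -- locks the table while it runs
SELECT pg_size_pretty(pg_total_relation_size('github_events'));  -- after
```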
[syndicated profile] planetposgresql_feed

Fans of Rapid Application Development (RAD!) are probably already familiar with Docker, but what does that have to do with Postgres? Database-driven applications are a dime a dozen these days, and a good RAD environment is something of a Holy Grail to coders and QA departments alike. Docker lets us spin up a Postgres instance in seconds, and discard it with a clean conscience.

There have even been some noises within certain circles about using it in a production context. Can we do something like that responsibly? Docker containers are practically guaranteed to be ephemeral, while production data most decidedly isn’t. The answer to this is ultimately complex, and something we’ll be exploring over the next several weeks.

Let’s get started.

Let There Be Light

Since Docker itself is a commercial product, getting the latest community packages (Docker CE) means making a free account on their website and downloading the packages for your preferred platform. For slightly older versions, RHEL and CentOS variants can install EPEL, where the docker-io package resides. Debian variants such as Ubuntu can simply install docker.io with apt.

Once we have Docker itself, we can actually start playing with it right away. We should start by grabbing the latest version of Postgres from the official Docker Hub image:

$> docker pull postgres:latest

latest: Pulling from library/postgres
3e17c6eae66c: Pull complete 
3d89ae9a47e4: Pull complete 
f7726fda7efe: Pull complete 
d1838499bd8f: Pull complete 
a5ec5aa60735: Pull complete 
1571d7170291: Pull complete 
0d6e41e13732: Pull complete 
787e3c45a9bb: Pull complete 
7b234cf83b22: Pull complete 
3a8ad2440289: Pull complete 
9351993374c0: Pull complete 
a8f3575e09a1: Pull complete 
a4c4b2ff0c3a: Pull complete 
Digest: sha256:73a1c4e98fb961bb4a5c55ad6428470a3303bd3966abc442fe937814f6bbc002
Status: Downloaded newer image for postgres:latest

$> docker images

REPOSITORY    TAG         IMAGE ID          CREATED           SIZE
postgres      latest      b106d5a0dc75      22 hours ago      287.2 MB

Since Docker images can build on top of each other, we can see that Postgres is composed of a long chain of dependencies. Despite that, the image itself isn't very large; 287 MB for an entire operating system and database server is pretty good. There are even smaller versions if we're willing to forgo certain UNIX tools. The 'postgres:10-alpine' image, for instance, is just under 40 MB.

Runnin’ with the Devil

Just having an image is pretty pointless, of course. Let’s start a container and see what’s inside. This is done with the docker run command. We’ll pass it a few parameters to make things easier to use later as well. Setting the name, for example, lets us manipulate the container later without having to use an ugly MD5 hash.

Since Docker containers run on a separate network layer, we want to publish the default Postgres 5432 port to 5555 to prevent conflicts with existing applications on the host system. And finally, we want to detach from the terminal so we can use it for other things, and that’s also probably how we’ll be using the container in any reasonable configuration.

$> docker run --name pg-test -p 5555:5432 -d postgres


$> docker ps

CONTAINER ID        IMAGE               COMMAND                  CREATED             STATUS              PORTS                    NAMES
69f21490ff41        postgres            "docker-entrypoint.sh"   40 seconds ago      Up 39 seconds       0.0.0.0:5555->5432/tcp   pg-test

$> psql -p 5555 -U postgres -h localhost

Timing is on.
Pager usage is off.
psql (10.0)
Type "help" for help.

postgres=# SELECT version();
 PostgreSQL 10.0 on x86_64-pc-linux-gnu, compiled by gcc (Debian 6.3.0-18) 6.3.0 20170516, 64-bit
(1 row)

Run and done. We can connect to the container on port 5555 as expected, we have the latest Postgres 10, and Postgres confirms its full version and build environment. Let’s try to make some tables.

Playing God

We've used the sensor_log table a lot in the past. A good reason for that is the sheer amount of data we normally insert into it. There's no reason to stop now, is there? Let's keep our connection open and take a couple of basic timings:

\timing on

CREATE TABLE sensor_log (
  sensor_log_id  SERIAL PRIMARY KEY,
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
);

Time: 7.945 ms

INSERT INTO sensor_log (location, reading, reading_date)
SELECT s.id % 1000, round(random() * 100),
       CURRENT_DATE + INTERVAL '1d' - ((s.id * 10)::TEXT || 's')::INTERVAL
  FROM generate_series(1, 5000000) s(id);

Time: 30981.570 ms (00:30.982)

CREATE INDEX idx_sensor_log_location ON sensor_log (location);

Time: 9339.747 ms (00:09.340)

CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);

Time: 3689.406 ms (00:03.689)

ANALYZE sensor_log;

Time: 247.532 ms

SELECT count(*) FROM sensor_log;


Time: 518.908 ms

SELECT count(*) FROM sensor_log
 WHERE location = '50';


Time: 8.243 ms

SELECT count(*) FROM sensor_log
 WHERE reading_date BETWEEN '2016-01-01' AND '2016-06-01';

(1 row)

Time: 139.237 ms

One weakness of this approach is that the Postgres data exists entirely within this single container. Since containers are intended to be temporary, that’s not great if we want to use the container as a somewhat bulky linked binary. We want to be able to kill—and even erase—our container in some cases, yet retain the data.

How do we manage that?

Sharing the Wealth

One way to do this with Docker is volumes. We can build a volume and start another container that uses it like this:

$> docker volume create --name pgdata
$> docker run --name pg-mount -p 5600:5432 -d \
              --volume pgdata:/var/lib/postgresql/data postgres

We created a pgdata volume, then told Docker that the volume should be mounted at /var/lib/postgresql/data. This is the default data location used by the maintainers of the official image, so a new container will use the external volume instead of its built-in storage.

Let’s connect to this new container, make a very small proof-of-concept table, then stop and erase the container. Then we can create a new container that uses the same volume, and see if the table is still there.

$> psql -p 5600 -U postgres -h localhost \
        -c "CREATE TABLE foo (id SERIAL);"

$> docker stop pg-mount
$> docker rm pg-mount
$> docker run --name pg-new-mount -p 5600:5432 -d \
              --volume pgdata:/var/lib/postgresql/data postgres

$> psql -p 5600 -U postgres -h localhost -c "\d foo"

                            Table "public.foo"
 Column |  Type   | Collation | Nullable |             Default             
--------+---------+-----------+----------+---------------------------------
 id     | integer |           | not null | nextval('foo_id_seq'::regclass)

Nice, eh? Now we can test various container variants, or image builds, against the same data volume.
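For example, assuming the pg-new-mount container from above is still running, we could stop it and point a different image variant at the same volume. (The data files on the volume must match the new server's major version, so this works between builds of the same Postgres release.)

```shell
$> docker stop pg-new-mount
$> docker run --name pg-alpine -p 5600:5432 -d \
              --volume pgdata:/var/lib/postgresql/data postgres:10-alpine
```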

Starting from Scratch

But what about performance? Docker containers utilize Copy on Write (CoW) on top of a proprietary filesystem. Is it possible that this abstraction layer could be interfering with our desired use of the hardware?

One of the handy things Docker can do is mount locations on the host machine. Let’s start another image by mapping to a drive on the Docker host, and see if the timings change at all.

We start by launching a new Docker container with the desired mount. The cool thing about the --volume flag is that it will create the directory for us if it doesn’t already exist. It’ll even give it the correct permissions, since Docker itself runs as a management daemon under the root user, and not the postgres user we’re probably accustomed to.

$> docker run --name pg-local -p 5700:5432 -d \
              --volume /db/docker:/var/lib/postgresql/data postgres

$> ls -l /db

drwx------ 19 vboxadd  root  4096 Oct 13 17:09 docker

Now let's connect to port 5700 and repeat our sensor_log timings:

\timing on

CREATE TABLE sensor_log (
  sensor_log_id  SERIAL PRIMARY KEY,
  location       VARCHAR NOT NULL,
  reading        BIGINT NOT NULL,
  reading_date   TIMESTAMP NOT NULL
);

Time: 7.977 ms

INSERT INTO sensor_log (location, reading, reading_date)
SELECT s.id % 1000, round(random() * 100),
       CURRENT_DATE + INTERVAL '1d' - ((s.id * 10)::TEXT || 's')::INTERVAL
  FROM generate_series(1, 5000000) s(id);

Time: 30180.467 ms (00:30.180)

CREATE INDEX idx_sensor_log_location ON sensor_log (location);

Time: 9279.165 ms (00:09.279)

CREATE INDEX idx_sensor_log_date ON sensor_log (reading_date);

Time: 3268.765 ms (00:03.269)

ANALYZE sensor_log;

Time: 205.313 ms

SELECT count(*) FROM sensor_log;


Time: 483.312 ms

SELECT count(*) FROM sensor_log
 WHERE location = '50';


Time: 7.234 ms

SELECT count(*) FROM sensor_log
 WHERE reading_date BETWEEN '2016-01-01' AND '2016-06-01';

(1 row)

Time: 141.441 ms

It turns out the filesystem mapping is comparable to Docker’s own filesystem, as the differences are statistically insignificant. Of course, now instead of using its own internal filesystem layer, Docker must maintain a mapping to the local filesystem, so it’s hard to say what’s actually going on behind the scenes. We’re also operating in a relatively puny virtual environment with its own storage performance limitations.

Still, a commendable result, as it leaves us the option to use standard filesystem paths. LVM? Sure. XFS? Why not. An external SAN device composed entirely of SSDs? Of course. There are interesting implications here.

Out of Eden

So are containers really just a bloated linked Postgres binary, since we could have just managed a local directory with a standard Postgres install? Yes and no. We’ll definitely explore more of these concepts in later articles, but it’s important to note that containers are meant for chaining and automation.

A standard Postgres install comes with a certain amount of administration: the configuration file, user creation, package maintenance, and so on. There is a lot that goes into building a Postgres server. Since each container is itself a tiny Postgres server, we don't really need to do any of that in simplified contexts.

Did the official Postgres package see an update? We can just update to the latest image, stop the existing container, and start a new one. The old container is still there, so reverting is trivial.
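Assuming the pgdata volume from earlier holds our data, that update cycle might look like this (the container names are illustrative, and this only applies cleanly across minor versions; a major-version jump still needs a dump/restore or pg_upgrade):

```shell
$> docker pull postgres:latest       # fetch the updated image
$> docker stop pg-mount              # stop the old container, but keep it around
$> docker run --name pg-mount-v2 -p 5600:5432 -d \
              --volume pgdata:/var/lib/postgresql/data postgres:latest
```

If the new image misbehaves, stopping pg-mount-v2 and starting pg-mount again gets us right back where we were.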

We can make our own image with contrib, plpython, and pglogical, and launch it anywhere and everywhere. We can hand it to developers who don't know anything about Postgres, and they can have their own database server that mirrors the production environment. And we can do it without an army of resource-intensive, difficult-to-provision VMs.
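Such a custom image is just a small Dockerfile on top of the official one. A hypothetical sketch; the exact package names depend on the base image's Debian release, and pglogical would additionally need its vendor repository configured:

```dockerfile
# Hypothetical custom image: official base plus contrib and PL/Python.
FROM postgres:10
RUN apt-get update \
    && apt-get install -y postgresql-contrib-10 postgresql-plpython3-10 \
    && rm -rf /var/lib/apt/lists/*
```

Build it once with docker build, then use the resulting tag everywhere we used postgres above.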

This is just the very tiniest tip of the proverbial iceberg. We’ll continue to build our toolkit in future articles, and maybe end up with something that isn’t arguably worse than a standard server environment, just different.

