Try to make capture software with as high accuracy as possible with python (1)

Introduction

OS: Windows 10 64-bit, Python: 3.7

There is plenty of recording software out there, but none that met my very personal need: recording the screen of a specific website at a specified time. So I will write it in Python, as a playful bit of brain training.

It is not so difficult that it needs separate articles, but due to time constraints I will write it in several parts.

In addition, I write about the various problems that came up while working and how I dealt with them, so the article is quite verbose. Please bear with me.

Policy

Overall, I assume the following flow.

  1. Run a command at the specified time
  2. Start the browser with the preset URL
  3. Capture browser screenshots and audio
  4. Merge the screenshots and audio and output a video
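
The first step of this flow is not covered on this page, but as a rough sketch of what "execution at the specified time" could look like, the helper below works out how long to sleep until a given clock time. The name `seconds_until` and the HH:MM input format are my own assumptions for illustration.

```python
from datetime import datetime, timedelta


def seconds_until(hhmm, now=None):
    """Return the seconds from `now` until the next occurrence of HH:MM."""
    now = now or datetime.now()
    hour, minute = (int(part) for part in hhmm.split(":"))
    target = now.replace(hour=hour, minute=minute, second=0, microsecond=0)
    if target <= now:
        # The time has already passed today, so aim for tomorrow.
        target += timedelta(days=1)
    return (target - now).total_seconds()


# Example: at 20:59:30, a 21:00 start is 30 seconds away.
print(seconds_until("21:00", datetime(2020, 1, 1, 20, 59, 30)))
```

A script could pass the result to `time.sleep()` before starting the browser and the capture.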

This page covers step 3 (capturing browser screenshots and audio). After a quick check of Python's well-known libraries, I couldn't find anything that captures both image and audio at the same time. (There may be something if you search GitHub. I'd appreciate it if you could tell me.)

I think it would be easy to do everything with ffmpeg, but this time I will try it under the pointless constraint of doing it from Python code.

Audio capture

I will try using the famous pyaudio.

Install pyaudio

Some people may get stuck here, so I will write this down as a memo. When installing pyaudio, the following error occurred in my environment.

src/_portaudiomodule.c(29): fatal error C1083: Cannot open include file: 'portaudio.h': No such file or directory

It seems that the build is failing. Referring to the link below, install from the whl file that matches your environment (here, Python 3.7 on 64-bit Windows).
https://minato86.hatenablog.com/entry/2019/04/04/005929#portaudiohNo-such-file-or-directory

pip install PyAudio-0.2.11-cp37-cp37m-win_amd64.whl

My execution environment is PyCharm, and I run the code in a Python virtual environment. If you simply run the above command, the package is installed only into the Python installed on the C drive, not into the virtual environment.

D:\~(Omission)~\venv\Scripts\activate.bat

After activating the virtual environment by running activate.bat from the command prompt, I ran the above command in that state to install it.
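
To confirm which Python a command prompt or IDE is really using, a small check like the following may help. This is only a sketch; the helper name `in_virtualenv` is my own, and it relies on the standard `sys.prefix` and `sys.base_prefix` attributes.

```python
import sys


def in_virtualenv():
    """Return True when the running interpreter belongs to a venv."""
    # In a venv, sys.prefix points at the venv while sys.base_prefix
    # still points at the base installation.
    return sys.prefix != getattr(sys, "base_prefix", sys.prefix)


print("interpreter:", sys.executable)
print("virtual environment active:", in_virtualenv())
```

Running the install as `python -m pip install ...` with that same interpreter installs the package into the environment the interpreter belongs to.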

Implementation with pyaudio

Now that the installation is complete, let's try recording.

A program that records with Python 3 and writes it to a wav file
https://ai-trend.jp/programming/python/voice-record/

I don't know what kind of library it is yet, so for the time being I ran the recording sample from this site as it is, without plugging in a microphone.

OSError: [Errno -9998] Invalid number of channels

It stopped with an error. This error is due to the absence of a recording input device. It is scolding me: "What are you trying to record without a microphone?" The input_device_index specified in audio.open seems to be wrong, so let's check the audio devices in the computer.

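import pyaudio

# List every audio device PortAudio can see, with its channel counts
audio = pyaudio.PyAudio()
for i in range(audio.get_device_count()):
    dev = audio.get_device_info_by_index(i)
    print(dev['index'], dev['name'],
          '(%d in, %d out)' % (dev['maxInputChannels'], dev['maxOutputChannels']))
audio.terminate()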

The devices listed on my machine (abridged):

   0 Microsoft Sound Mapper - Output, MME (0 in, 2 out)
<  1 BenQ GL2460 (NVIDIA High Defini, MME (0 in, 2 out)
   2 Realtek Digital Output (Realtek, MME (0 in, 2 out)
   5 Realtek Digital Output (Realtek High Definition Audio), Windows DirectSound (0 in, 2 out)
   6 Realtek Digital Output (Realtek High Definition Audio), Windows WASAPI (0 in, 2 out)
   ...
  10 stereo mixer(Realtek HD Audio Stereo input), Windows WDM-KS (2 in, 0 out)
  12 microphone(Realtek HD Audio Mic input), Windows WDM-KS (2 in, 0 out)

There are a lot of them. BenQ and Realtek Digital Output are output devices, that is, speakers, so they are not what I want. There is a microphone among the input devices, but that would record sound from outside the computer.

Which one is likely to capture the sound playing inside the computer?

10 stereo mixer(Realtek HD Audio Stereo input), Windows WDM-KS (2 in, 0 out)

The stereo mixer is a feature that provides the sound playing inside the computer as an input. Here, let's select the stereo mixer. (It seems this can also be done with WASAPI, but I will ignore that here.) Set the stereo mixer's number as the device index and try running it.
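
Device numbers differ from machine to machine, so instead of hard-coding the index it may be safer to look the device up by name. The sketch below is one way to do that; `find_input_device` is a hypothetical helper of mine, and the sample list only imitates the dictionaries returned by PyAudio's `get_device_info_by_index` (the `index`, `name` and `maxInputChannels` keys).

```python
def find_input_device(devices, keyword):
    """Return the index of the first input-capable device whose name
    contains `keyword` (case-insensitive), or None if there is none."""
    for dev in devices:
        if dev["maxInputChannels"] > 0 and keyword.lower() in dev["name"].lower():
            return dev["index"]
    return None


# Sample data shaped like PyAudio's device-info dictionaries.
# With PyAudio installed, the list could be built with:
#   [audio.get_device_info_by_index(i) for i in range(audio.get_device_count())]
sample_devices = [
    {"index": 1, "name": "BenQ GL2460", "maxInputChannels": 0},
    {"index": 10, "name": "stereo mixer (Realtek HD Audio Stereo input)", "maxInputChannels": 2},
    {"index": 12, "name": "microphone (Realtek HD Audio Mic input)", "maxInputChannels": 2},
]
print(find_input_device(sample_devices, "Stereo Mixer"))
```

Filtering on `maxInputChannels` also skips output-only devices, which cannot be opened with `input=True`.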

[Errno -9999] Unanticipated host error

Another error. I suspect this one depends on the settings on the OS side, so let's check them.

https://ahiru8usagi.hatenablog.com/entry/Windows10_Recording
In the Sound control panel, open the Recording tab and enable Stereo Mixer, then select the stereo mixer under "Sound" -> "Input" in the Windows settings.

With that, I was able to record successfully. (The code is in a later section.)

Screen capture

Let's take screenshots with ImageGrab, which is part of Pillow.

pip install Pillow

Simultaneous recording and audio capture

Finally, the code (a prototype). It has some problems, but I will post it as a progress report.

import cv2
import numpy as np
from PIL import ImageGrab
import ctypes
import time
import pyaudio
import wave

#Save start time

# parentTime = time.time()
# for i in range(10):
#     img_cv = np.asarray(ImageGrab.grab())
# current = time.time()
# diff = (current - parentTime)
# print("fps:" + str(float(10)/diff))


# Screen size in pixels, used as the video frame size
user32 = ctypes.windll.user32
capSize = (user32.GetSystemMetrics(0), user32.GetSystemMetrics(1))

# Video writer: DIVX codec, 30 fps
fourcc = cv2.VideoWriter_fourcc(*"DIVX")
writer = cv2.VideoWriter("test.mov", fourcc, 30, capSize)
count = 0

WAVE_OUTPUT_FILENAME = "test.wav"
RECORD_SECONDS = 40

FORMAT = pyaudio.paInt16
CHANNELS = 1
RATE = 44100
CHUNK = 2 ** 11
audio = pyaudio.PyAudio()
stream = audio.open(format=FORMAT,
                    channels=CHANNELS,
                    rate=RATE,
                    input=True,
                    input_device_index=0,  # index of the recording device, e.g. the stereo mixer
                    frames_per_buffer=CHUNK)

frames = []

# #Save start time
# sTime = time.time()
# count = 0

print ("start")
for i in range(0, int(RATE / CHUNK * RECORD_SECONDS)):
    # count+=1
    # if count == 30 :
    #     current = time.time()
    #     diff = (current - sTime)
    #     print("fps:" +  str(float(count)/diff))
    #     sTime = time.time()
    #     count = 0

    #Image capture
    # ImageGrab returns RGB, while OpenCV's VideoWriter expects BGR
    img_cv = np.asarray(ImageGrab.grab())
    img = cv2.cvtColor(img_cv, cv2.COLOR_RGB2BGR)
    writer.write(img)

    #Audio capture
    data = stream.read(CHUNK)
    frames.append(data)
print ("finish")


writer.release()
stream.stop_stream()
stream.close()
audio.terminate()

waveFile = wave.open(WAVE_OUTPUT_FILENAME, 'wb')
waveFile.setnchannels(CHANNELS)
waveFile.setsampwidth(audio.get_sample_size(FORMAT))
waveFile.setframerate(RATE)
waveFile.writeframes(b''.join(frames))
waveFile.close()

This code saves the video (mov) and the audio (wav) separately. The recording time should be 40 seconds, the value specified in RECORD_SECONDS. Well, when I ran it, the result was...

Audio file is 39 seconds
Video file is 28 seconds

Leaving the audio file aside, the video file gave a very suspicious result.
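
Rather than reading the length off a media player, the audio duration can be checked from Python with the standard `wave` module. The following is a small sketch; `wav_duration` is a name I made up, and the in-memory file is only there so that the example runs without test.wav.

```python
import io
import wave


def wav_duration(source):
    """Return the length in seconds of a WAV file (path or file object)."""
    with wave.open(source, "rb") as wav:
        return wav.getnframes() / wav.getframerate()


# Self-check with one second of 16-bit mono silence held in memory.
buffer = io.BytesIO()
with wave.open(buffer, "wb") as wav:
    wav.setnchannels(1)
    wav.setsampwidth(2)
    wav.setframerate(44100)
    wav.writeframes(b"\x00\x00" * 44100)
buffer.seek(0)
print(wav_duration(buffer))
```

For the recording above, `wav_duration("test.wav")` would give the audio length. The video length should be obtainable in a similar way by dividing OpenCV's `cv2.CAP_PROP_FRAME_COUNT` by `cv2.CAP_PROP_FPS` on a `cv2.VideoCapture`, though I have not verified that for this container and codec.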

writer = cv2.VideoWriter("test.mov", fourcc, 30, capSize)

I set the fps used when saving the video to 30 without much thought, but this value seems to be wrong. Let's roughly measure the actual fps.

    if count == 30 :
        current = time.time()
        diff = (current - sTime)
        print("fps:" +  str(float(count)/diff))
        sTime = time.time()
        count = 0

The result is about 14 to 19 fps, meaning that 14 to 19 images are output per second. What presumably happens inside is this: frames that arrive at roughly 0.06-second intervals were written as if they arrived at 0.03-second intervals, so the video plays fast-forwarded and its duration comes out short.
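
The 28-second figure can also be reproduced with plain arithmetic from the constants in the prototype, since exactly one video frame is written per loop iteration. This is only a back-of-the-envelope check; `DECLARED_FPS` is a name I introduce for the 30 passed to cv2.VideoWriter.

```python
RATE = 44100          # audio samples per second
CHUNK = 2 ** 11       # samples read per loop iteration
RECORD_SECONDS = 40
DECLARED_FPS = 30     # fps passed to cv2.VideoWriter

# One video frame is written per loop iteration.
frames = int(RATE / CHUNK * RECORD_SECONDS)

audio_seconds = frames * CHUNK / RATE
video_seconds = frames / DECLARED_FPS

print(frames)                    # 861
print(round(audio_seconds, 2))   # 39.98
print(round(video_seconds, 2))   # 28.7
```

So 861 frames played back at 30 fps last 28.7 seconds, whatever the real capture speed was. Because each `stream.read(CHUNK)` waits for 2048 samples, the loop also cannot run faster than about 21.5 iterations per second.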

writer = cv2.VideoWriter("test.mov", fourcc, 15, capSize)

Changing to 15 fps, and skipping or duplicating frames that arrive too early or too late, might solve the timing problem.
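
As a sketch of that skip-or-duplicate idea (not something the prototype does yet), each captured frame can be compared against the clock: if the file is already ahead, the frame is dropped, and if it is behind, the frame is written several times. The helper names `copies_due` and `plan_writes` are hypothetical.

```python
def copies_due(elapsed, written, fps):
    """How many copies of the frame captured `elapsed` seconds after the
    start must be written so the file stays in step with the clock."""
    should_have = int(elapsed * fps) + 1   # frames due up to this moment
    return max(should_have - written, 0)


def plan_writes(timestamps, fps):
    """Return, for each capture timestamp, how many times to write it."""
    written = 0
    plan = []
    for elapsed in timestamps:
        n = copies_due(elapsed, written, fps)
        plan.append(n)
        written += n
    return plan


# Frames captured at irregular moments, written to a 10 fps file.
print(plan_writes([0.0, 0.05, 0.25, 0.31], fps=10))
```

In the capture loop, `elapsed` would come from `time.time()` minus the start time, and `writer.write(img)` would be called that many times for the current frame.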

But there is a problem before that... it's slow. Even if it can record, the video will probably be choppy.

Which part of the processing is the cause? Let's measure the speed of each step.

#ImageGrab alone
parentTime = time.time()
for i in range(40):
    img = ImageGrab.grab()
current = time.time()
diff = (current - parentTime)
print("fps:" + str(float(40)/diff))

#ImageGrab+numpy
parentTime = time.time()
for i in range(40):
    img_cv = np.asarray(ImageGrab.grab())
current = time.time()
diff = (current - parentTime)
print("fps:" + str(float(40)/diff))

The results:

ImageGrab.grab() alone: 27 fps
ImageGrab.grab() with conversion to numpy: 20 fps
ImageGrab.grab() with conversion to numpy and RGB conversion: 18 fps
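
Converting these fps figures into time per frame makes it easier to see what each step costs. The helper `per_frame_costs` below is my own illustration and uses only the three numbers reported here.

```python
def per_frame_costs(measured):
    """Turn cumulative fps figures into (stage, ms per frame, added ms)."""
    rows = []
    previous_ms = 0.0
    for stage, fps in measured:
        frame_ms = 1000 / fps
        rows.append((stage, round(frame_ms, 1), round(frame_ms - previous_ms, 1)))
        previous_ms = frame_ms
    return rows


# The three measurements reported above, in pipeline order.
measured = [
    ("grab", 27),
    ("grab + numpy", 20),
    ("grab + numpy + colour conversion", 18),
]
for stage, total_ms, added_ms in per_frame_costs(measured):
    print(f"{stage}: {total_ms} ms per frame (+{added_ms} ms)")
```

A 30 fps recording leaves only about 33.3 ms per frame, so at roughly 37 ms the grab alone already exceeds the budget before any conversion is added.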

I wouldn't say that ImageGrab.grab() is slow, but it is a bit of a problem once post-processing is taken into account.

The conversion steps could each be improved, but first I will look for something that captures the image faster than ImageGrab.grab().
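
When comparing capture methods, the timing loop used so far can be wrapped in a small helper so that every candidate is measured the same way. This is a minimal sketch; `measure_fps` is a hypothetical name, and the workload is a stand-in so the example runs anywhere.

```python
import time


def measure_fps(func, repeat=40):
    """Call `func` `repeat` times and return the calls per second."""
    start = time.perf_counter()
    for _ in range(repeat):
        func()
    elapsed = time.perf_counter() - start
    return repeat / elapsed


# Stand-in workload; replace with e.g. ImageGrab.grab to time a capture.
print("fps:", measure_fps(lambda: sum(range(10000))))
```

`time.perf_counter()` is used here instead of `time.time()` because it is intended for measuring short intervals.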

https://stackoverrun.com/ja/q/3379877

Referring to the site above, I tried capturing with the Windows API.

# windows_api
import time
import win32gui, win32ui, win32con, win32api

hwin = win32gui.GetDesktopWindow()
width = 1920
height = 1080
left = win32api.GetSystemMetrics(win32con.SM_XVIRTUALSCREEN)
top = win32api.GetSystemMetrics(win32con.SM_YVIRTUALSCREEN)
# Device context of the desktop, plus a memory DC and bitmap to copy into
hwindc = win32gui.GetWindowDC(hwin)
srcdc = win32ui.CreateDCFromHandle(hwindc)
memdc = srcdc.CreateCompatibleDC()
bmp = win32ui.CreateBitmap()
bmp.CreateCompatibleBitmap(srcdc, width, height)
memdc.SelectObject(bmp)

parentTime = time.time()
arr = []
for i in range(30):
    # Copy the screen into the memory bitmap
    memdc.BitBlt((0, 0), (width, height), srcdc, (left, top), win32con.SRCCOPY)
    # Note: this appends the same DC object each time, not a copy of the pixels
    arr.append(memdc)
current = time.time()
diff = (current - parentTime)
print("fps:" + str(float(30)/diff))
fps:48.752326144998015

48 fps. Blazingly fast. It looks promising, so I will go with this API from here on.
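
Note that this benchmark only times the BitBlt into the memory DC. To hand frames to a video writer, the pixels still have to be read out and reordered. As a rough sketch, assuming the bitmap bytes (for example from pywin32's `GetBitmapBits(True)`) arrive as 4 bytes per pixel in B, G, R, X order on a 32-bit display, the reordering could look like this. `bgra_to_rgb` is a hypothetical helper, and the byte layout is an assumption I have not verified here.

```python
def bgra_to_rgb(buf):
    """Convert packed B, G, R, X bytes into packed R, G, B bytes."""
    rgb = bytearray(len(buf) // 4 * 3)
    rgb[0::3] = buf[2::4]   # red
    rgb[1::3] = buf[1::4]   # green
    rgb[2::3] = buf[0::4]   # blue
    return bytes(rgb)


# Two pixels: pure blue, then pure red, as they would sit in a BGRX buffer.
pixels = bytes([255, 0, 0, 0,   0, 0, 255, 0])
print(list(bgra_to_rgb(pixels)))
```

This extra step costs time as well, so the real frame rate with this API will be lower than the figure measured above.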

I've run out of time, so the rest is for next time.
