The following Python script extracts the content of an article using Readability (ported to Python), converts it to speech using the Amazon Polly service, and finally sends the audio as a voice note to a given Telegram user using Telethon (a Telegram client for Python).
To run the script, you will need to install the following Python packages:
    pip install boto3
    pip install awscli
    pip install readability-lxml
    pip install telethon
    pip install requests
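As a quick sanity check after installing, the sketch below reports which of the modules the script imports cannot be found. The helper name `missing_modules` is made up for illustration; note that the readability-lxml package is imported as `readability`.

```python
import importlib.util


def missing_modules(names):
    """Return the top-level module names that cannot be found in this environment."""
    return [name for name in names if importlib.util.find_spec(name) is None]


if __name__ == "__main__":
    # Import names used by the script (readability-lxml installs `readability`).
    required = ["boto3", "readability", "telethon", "requests"]
    print("Missing modules:", missing_modules(required) or "none")
```

If the list is not empty, install the corresponding package before running the script.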
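Also, you will need to create an AWS account. If you already have an AWS account, make sure that you have a user created in the IAM Management Console with permissions for Amazon Polly and for the S3 bucket used by the script (the script stores, downloads, and deletes the audio file there).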

When creating this user, make sure you write down its access key ID and its secret access key. You will need them to configure the aws-cli client.
With these Amazon credentials, you can configure the AWS client by running the following command:

    aws configure

In this step, fill in the prompts with the user credentials you wrote down when creating the user.
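To confirm that the keys were saved, a minimal sketch like the one below can inspect the credentials file. It assumes the file is in its default location (`.aws/credentials` under your home directory) and only checks that file, not environment variables or other credential sources. The function name `profile_has_keys` is hypothetical.

```python
import configparser
from pathlib import Path


def profile_has_keys(credentials_path, profile="default"):
    """Return True if the profile defines both keys in the credentials file."""
    # Disable interpolation so special characters in a key are read literally.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read(credentials_path)  # a missing file is silently ignored
    if not parser.has_section(profile):
        return False
    section = parser[profile]
    return bool(section.get("aws_access_key_id")) and bool(
        section.get("aws_secret_access_key")
    )


if __name__ == "__main__":
    path = Path.home() / ".aws" / "credentials"
    print("default profile configured:", profile_has_keys(path))
```

The script later opens the same profile with `boto3.Session(profile_name='default')`, so a `False` result here suggests the configuration step should be repeated.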
Now, you will need to create a Telegram API ID. To do this, go to the Telegram section "Create an Application". After following the steps described in the official documentation, you will obtain an API ID (a number) and an API hash (a string).
With these steps already completed, you can place all the needed details in the script and run it.
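If you would rather not write credentials into the script itself, one option is to read them from environment variables. The following is only a sketch: the variable names (`ARTICLE_URL`, `TELEGRAM_API_ID`, and so on) and the `Settings` class are assumptions made for this example, not something the script or the libraries define.

```python
import os
from dataclasses import dataclass


@dataclass
class Settings:
    url: str
    api_id: int
    api_hash: str
    telegram_user: str
    bucket_name: str


def load_settings(env=os.environ):
    """Build Settings from environment variables (names are hypothetical)."""
    names = [
        "ARTICLE_URL",
        "TELEGRAM_API_ID",
        "TELEGRAM_API_HASH",
        "TELEGRAM_USER",
        "S3_BUCKET_NAME",
    ]
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise ValueError("Missing settings: " + ", ".join(missing))
    return Settings(
        url=env["ARTICLE_URL"],
        api_id=int(env["TELEGRAM_API_ID"]),  # the API ID is a number
        api_hash=env["TELEGRAM_API_HASH"],   # the API hash is a string
        telegram_user=env["TELEGRAM_USER"],
        bucket_name=env["S3_BUCKET_NAME"],
    )
```

The resulting values can then be assigned to the `url`, `api_id`, `api_hash`, `telegram_user` and `bucket_name` variables in the configuration part of the script.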
    import os
    import re
    import time

    import boto3
    import requests
    from readability import Document
    # Importing telethon.sync allows the client methods to be called synchronously
    from telethon import TelegramClient, sync

    #--------------------------------------------------------------------
    # Configuration
    #--------------------------------------------------------------------

    ##### Article #####
    # Define URL
    url = "PLACE_THE_URL_HERE"

    ##### Telegram #####
    # Telegram API credentials
    api_id = PLACE_API_ID_HERE
    api_hash = 'PLACE_API_HASH_HERE'

    # Telegram user to send messages
    telegram_user = "PLACE_TELEGRAM_USER_HERE"

    # Create Telegram client
    telegram_client = TelegramClient('session_name', api_id, api_hash)
    telegram_client.start()

    ##### AWS configuration #####
    # Get AWS session
    session = boto3.Session(profile_name='default')

    # Get Polly client
    polly = session.client('polly')

    # Create an S3 client to retrieve the file content
    s3 = session.client('s3')

    # Define bucket name
    bucket_name = 'PLACE_BUCKET_NAME_HERE'

    #--------------------------------------------------------------------
    # Get HTML from the article
    #--------------------------------------------------------------------

    # Get HTML and stop early on HTTP errors
    response = requests.get(url)
    response.raise_for_status()

    # Extract content with Readability
    doc = Document(response.text)

    # Get article body
    html_text = doc.summary()

    # Regular expression to identify opening and closing HTML tags
    html_tag_re = r"</?[^>]+>"

    # Remove HTML tags from the article body
    text_only = re.sub(html_tag_re, "", html_text)

    # Send message to user pointing to the URL that is going to be converted
    telegram_client.send_message(telegram_user, "Converting: %s" % url)

    #--------------------------------------------------------------------
    # Convert text to voice
    #--------------------------------------------------------------------

    # Start Polly task to save the audio in a bucket
    # (Enrique is a Castilian Spanish voice; pick one that matches the article's language)
    resp = polly.start_speech_synthesis_task(OutputFormat='mp3',
                                             OutputS3BucketName=bucket_name,
                                             Text=text_only,
                                             VoiceId='Enrique')

    # Get Polly task
    task = polly.get_speech_synthesis_task(TaskId=resp['SynthesisTask']['TaskId'])

    # Monitor task status until it is completed
    while task['SynthesisTask']['TaskStatus'] != 'completed':
        # Stop instead of polling forever if Polly reports a failure
        if task['SynthesisTask']['TaskStatus'] == 'failed':
            raise RuntimeError("Polly task failed: %s"
                               % task['SynthesisTask'].get('TaskStatusReason'))

        # Wait 2 seconds between server polls
        time.sleep(2)

        # Get Polly task
        task = polly.get_speech_synthesis_task(TaskId=task['SynthesisTask']['TaskId'])

        # Print the status of the task
        print("Task status: %s" % task['SynthesisTask']['TaskStatus'])

    print("Task completed!")

    #--------------------------------------------------------------------
    # Retrieve file and send to Telegram user
    #--------------------------------------------------------------------

    # Regular expression to extract the key (file name) of the synthesized file
    key_re = r'/([0-9A-Za-z.-]+)$'

    # Search the regular expression in the OutputUri
    regex_search = re.search(key_re, task['SynthesisTask']['OutputUri'])

    # Take only the first group of the re (key)
    file_key = regex_search.group(1)

    # Get file name to store locally
    title_sanitized = doc.short_title().replace('"', '')
    title_sanitized = title_sanitized.replace(':', '')
    file_name = "%s.mp3" % title_sanitized

    # Download file from the bucket and store it in a local MP3 file
    with open(file_name, 'wb') as data:
        s3.download_fileobj(bucket_name, file_key, data)

    # Delete remote bucket file
    s3.delete_object(Bucket=bucket_name, Key=file_key)

    # Send MP3 file as a voice note to the Telegram user
    telegram_client.send_file(telegram_user, file_name, voice_note=True)

    # Send signature
    telegram_client.send_message(telegram_user, "Message sent from Romancero bot.")

    # Delete local file once it has been sent
    if os.path.isfile(file_name):
        os.remove(file_name)
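The polling step in the script can also be factored into a small helper that puts an upper bound on the wait. The sketch below is one possible shape, not part of the original script: `wait_for_task`, its parameters, and the default timeout are assumptions. It takes any callable that returns the current status string, so it can be tried without contacting AWS.

```python
import time


def wait_for_task(get_status, poll_seconds=2.0, timeout_seconds=600.0, sleep=time.sleep):
    """Poll get_status() until it returns 'completed'.

    Raises RuntimeError if the task reports 'failed', and TimeoutError if
    the task is still unfinished once timeout_seconds have been waited.
    """
    waited = 0.0
    while True:
        status = get_status()
        if status == "completed":
            return status
        if status == "failed":
            raise RuntimeError("Synthesis task failed")
        if waited >= timeout_seconds:
            raise TimeoutError("Task still %s after %.0f seconds" % (status, waited))
        sleep(poll_seconds)
        waited += poll_seconds
```

With the names used in the script, the status callable could be `lambda: polly.get_speech_synthesis_task(TaskId=resp['SynthesisTask']['TaskId'])['SynthesisTask']['TaskStatus']`.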
The user you specified will receive a message like this: