tensorflow: Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches
一、Summary
One-sentence summary:
Make sure that batch_size (set on the image-augmentation generator) × steps_per_epoch (set in fit) is less than or equal to the number of training samples.
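The constraint above can be sketched in plain Python (the helper name `max_steps_per_epoch` is hypothetical, not a Keras API): the largest safe steps_per_epoch is the number of training samples floor-divided by the batch size.

```python
def max_steps_per_epoch(n_samples, batch_size):
    """Largest steps_per_epoch such that steps_per_epoch * batch_size <= n_samples."""
    return n_samples // batch_size

# 2000 training images (1000 cats + 1000 dogs), as in the example below:
print(max_steps_per_epoch(2000, 20))  # -> 100, matching steps_per_epoch=100
print(max_steps_per_epoch(2000, 32))  # -> 62
```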
train_generator = train_datagen.flow_from_directory(
        train_dir,               # target directory
        target_size=(150, 150),  # resize all images to 150x150
        batch_size=20,
        class_mode='binary')     # binary labels, because the binary_crossentropy loss is used
history = model.fit(
        train_generator,
        steps_per_epoch=100,
        epochs=150,
        validation_data=validation_generator,
        validation_steps=50)
# case 1
# If train_generator's batch_size above is 32 and steps_per_epoch=100 here, training fails with:
"""
tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset
or generator can generate at least `steps_per_epoch * epochs` batches (in this case,
50 batches). You may need to use the repeat() function when building your dataset.
"""
# because there are 2000 training samples (1000 cats + 1000 dogs), which is less than 100*32.

# case 2
# If batch_size is 20 and steps_per_epoch=100, there is no error:
# 20*100 exactly matches the 2000 training samples.

# case 3
# If batch_size is 32 and steps_per_epoch=int(1000/32), there is no error,
# only a warning, because the division is again not exact.
# No error because int(1000/32)*32 < 2000.

# case 4
# If batch_size is 40 and steps_per_epoch=100, it fails again,
# because 40*100 > 2000.
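The four cases can be checked with a small pure-Python sketch (the helper name `will_warn` is hypothetical; a Keras directory iterator can supply at most ceil(n_samples / batch_size) batches per pass):

```python
import math

def will_warn(n_samples, batch_size, steps_per_epoch):
    """True if fit() would exhaust the generator: it can only supply
    ceil(n_samples / batch_size) batches per epoch."""
    return steps_per_epoch > math.ceil(n_samples / batch_size)

n = 2000  # 1000 cats + 1000 dogs
print(will_warn(n, 32, 100))           # case 1: True  (100 > ceil(2000/32) = 63)
print(will_warn(n, 20, 100))           # case 2: False (exact fit, 100 == 2000/20)
print(will_warn(n, 32, int(1000/32)))  # case 3: False (31 steps < 63 available)
print(will_warn(n, 40, 100))           # case 4: True  (100 > ceil(2000/40) = 50)
```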
二、tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches
Source / reference: https:///questions/60509425/how-to-use-repeat-function-when-building-data-in-keras
1、The error
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least steps_per_epoch * epochs batches (in this case, 5000 batches). You may need to use the repeat() function when building your dataset.
2、Symptom
I am training a binary classifier on a dataset of cats and dogs:

Total Dataset: 10000 images
Training Dataset: 8000 images
Validation/Test Dataset: 2000 images
The Jupyter notebook code:
# Part 2 - Fitting the CNN to the images
train_datagen = ImageDataGenerator(rescale = 1./255,
                                   shear_range = 0.2,
                                   zoom_range = 0.2,
                                   horizontal_flip = True)

test_datagen = ImageDataGenerator(rescale = 1./255)

training_set = train_datagen.flow_from_directory('dataset/training_set',
                                                 target_size = (64, 64),
                                                 batch_size = 32,
                                                 class_mode = 'binary')

test_set = test_datagen.flow_from_directory('dataset/test_set',
                                            target_size = (64, 64),
                                            batch_size = 32,
                                            class_mode = 'binary')

history = model.fit_generator(training_set,
                              steps_per_epoch=8000,
                              epochs=25,
                              validation_data=test_set,
                              validation_steps=2000)
I trained it on a CPU without a problem, but when I run it on a GPU it throws this error:
Found 8000 images belonging to 2 classes.
Found 2000 images belonging to 2 classes.
WARNING:tensorflow:From <ipython-input-8-140743827a71>:23: Model.fit_generator (from tensorflow.python.keras.engine.training) is deprecated and will be removed in a future version.
Instructions for updating:
Please use Model.fit, which supports generators.
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
WARNING:tensorflow:sample_weight modes were coerced from
...
to
['...']
Train for 8000 steps, validate for 2000 steps
Epoch 1/25
250/8000 [..............................] - ETA: 21:50 - loss: 7.6246 - accuracy: 0.5000
WARNING:tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset or generator can generate at least `steps_per_epoch * epochs` batches (in this case, 200000 batches). You may need to use the repeat() function when building your dataset.
250/8000 [..............................] - ETA: 21:52 - loss: 7.6246 - accuracy: 0.5000
How do I use the repeat() function in Keras with TensorFlow 2.0?
3、Solution
The problem stems from the fact that the parameters steps_per_epoch and validation_steps need to equal the total number of data points divided by the batch_size.
Your code would work in Keras 1.X, prior to August 2017.
Change your model.fit function to:
history = model.fit_generator(training_set,
                              steps_per_epoch=int(8000/batch_size),
                              epochs=25,
                              validation_data=test_set,
                              validation_steps=int(2000/batch_size))
As of TensorFlow 2.1, fit_generator is deprecated. You can use the .fit() method on generators as well.
TensorFlow >= 2.1 code:
history = model.fit(training_set.repeat(),
                    steps_per_epoch=int(8000/batch_size),
                    epochs=25,
                    validation_data=test_set.repeat(),
                    validation_steps=int(2000/batch_size))
Notice that int(8000/batch_size) is equivalent to 8000 // batch_size (integer division)
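A quick arithmetic check of that equivalence for the sizes used above (plain Python, no Keras involved):

```python
batch_size = 32
# int(x / y) and x // y agree for these positive sizes:
print(int(8000 / batch_size), 8000 // batch_size)  # both 250
print(int(2000 / batch_size), 2000 // batch_size)  # both 62 (62.5 truncated)
```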
============================================================================
In other words, steps_per_epoch=int(8000/batch_size), where 8000 is the number of training samples.
4、Worked example
The training set has 1,000 images of each class (2,000 images in total):
train_datagen = ImageDataGenerator(
rescale=1./255,
rotation_range=40,
width_shift_range=0.2,
height_shift_range=0.2,
shear_range=0.2,
zoom_range=0.2,
horizontal_flip=True,)
# Note: the validation data must not be augmented
test_datagen = ImageDataGenerator(rescale=1./255)
# 這里batch_size不能是32,不然就報如下錯誤
'''
WARNING:tensorflow:Your input ran out of data;
interrupting training. Make sure that your dataset or generator can generate at least
steps_per_epoch * epochs batches (in this case, 5000 batches).
You may need to use the repeat() function when building your dataset.
'''
# 可能是整除關(guān)系吧
train_generator = train_datagen.flow_from_directory(
        train_dir,               # target directory
        target_size=(150, 150),  # resize all images to 150x150
        batch_size=20,
        class_mode='binary')     # binary labels, because the binary_crossentropy loss is used
validation_generator = test_datagen.flow_from_directory(
        validation_dir,
        target_size=(150, 150),
        batch_size=20,
        class_mode='binary')

history = model.fit(
        train_generator,
        steps_per_epoch=100,
        epochs=150,
        validation_data=validation_generator,
        validation_steps=50)
If batch_size=32 above, then steps_per_epoch=100 here triggers the warning.
What steps_per_epoch does: after drawing steps_per_epoch batches from the generator (i.e. after running steps_per_epoch gradient-descent steps), the fitting process moves on to the next epoch.
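To make the bookkeeping concrete (pure arithmetic, no Keras): each epoch consumes steps_per_epoch batches, so the whole run needs steps_per_epoch * epochs batches in total, which is exactly the quantity the warning message mentions.

```python
steps_per_epoch = 100
epochs = 150
batch_size = 20

batches_needed = steps_per_epoch * epochs        # 15000 batches over the whole run
images_per_epoch = steps_per_epoch * batch_size  # 2000 images: one full pass per epoch
print(batches_needed, images_per_epoch)
```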
4.1、Detailed test results
# case 1
# If train_generator's batch_size above is 32 and steps_per_epoch=100 here, training fails with:
"""
tensorflow:Your input ran out of data; interrupting training. Make sure that your dataset
or generator can generate at least `steps_per_epoch * epochs` batches (in this case,
50 batches). You may need to use the repeat() function when building your dataset.
"""
# because there are 2000 training samples (1000 cats + 1000 dogs), which is less than 100*32.

# case 2
# If batch_size is 20 and steps_per_epoch=100, there is no error:
# 20*100 exactly matches the 2000 training samples.

# case 3
# If batch_size is 32 and steps_per_epoch=int(1000/32), there is no error,
# only a warning, because the division is again not exact.
# No error because int(1000/32)*32 < 2000.

# case 4
# If batch_size is 40 and steps_per_epoch=100, it fails again,
# because 40*100 > 2000.